tech blog


Taming Big Data


The Data Platform Engineering Team at AppNexus has been running Hadoop in production for the last 4+ years. Both our data volume and the number of customer use cases supported by our Hadoop infrastructure have grown rapidly since we adopted the Hadoop stack. In 2012 we were processing 10 terabytes of data per day; today we process over 170 terabytes per day.

We evaluated various commercial and open source solutions to reduce our storage footprint, improve Hadoop utilization, and unlock YARN’s multi-tenancy promises.

Our talk covers:

  • Architecture of the AppNexus Data Platform: data ingestion, processing, and how our customers consume the data.
  • Complex use cases supported by MapReduce, Vertica, and Spark Streaming.
  • Overview of how we offer our Data Platform as a Service to AppNexus’ business units, so that teams can build and manage their own YARN application deployments.

Watch the video to learn how the AppNexus Data Platform Engineering Team has run Hadoop in production for the last 4+ years. AppNexus’ advertising systems generate over 175 TB of data every day, which makes running a mission-critical data pipeline with tight SLAs extremely challenging.

Join our NYC tech community Meetup to stay up to date on future tech talks: http://www.meetup.com/TechTalks-AppNexus-NYC/
