AWS Launches Amazon Athena
November 30, 2016
Amazon Web Services, Inc. (AWS), an Amazon.com company (NASDAQ: AMZN), today announced Amazon Athena, a serverless query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL. With a few clicks in the AWS Management Console, customers can point Amazon Athena at their data stored in Amazon S3 and begin using standard SQL to run queries and get results in seconds. With Amazon Athena there are no clusters to manage and tune, no infrastructure to set up or manage, and customers pay only for the queries they run. Amazon Athena scales automatically – executing queries in parallel – so results are fast, even with large datasets and complex queries. To get started with Amazon Athena, visit https://aws.amazon.com/athena.
AWS analytics services like Amazon Redshift and Amazon EMR have made petabyte-scale analytics accessible to companies of all sizes. With Amazon Redshift, customers can perform complex queries on massive collections of structured data and get superfast performance. For unstructured data, Amazon EMR makes it fast and cost-effective to process and analyze vast amounts of data across dynamically scalable clusters using popular distributed frameworks like Apache Spark, Presto, Hive, and Pig. While these services are scalable and powerful enough to handle the largest and most complex big data applications, many customers also want to be able to very quickly run queries on data stored in Amazon S3 (e.g. web logs, clickstreams, and raw event files) without having to spin up, configure, and manage a Hadoop cluster or a data warehouse. Now, with Amazon Athena, analyzing data stored in Amazon S3 is as simple as writing SQL queries. Amazon Athena uses Presto with full standard SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, and Parquet. And, while Amazon Athena is ideal for quick, ad-hoc querying and integrates with Amazon QuickSight for easy visualization, it can also handle complex analysis, including large joins, window functions, and arrays. Because Amazon Athena executes queries using compute resources in multiple Availability Zones and uses Amazon S3 as the underlying data store, it is highly available and durable with data redundantly stored across multiple facilities and multiple devices in each facility.
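As a rough illustration of the workflow described above, the following Python sketch builds the two SQL statements a user might submit: one that maps a table onto CSV files already sitting in Amazon S3, and one that ranks URLs by hit count with a window function. The bucket path, table name, and column names are hypothetical and are not taken from this announcement. The sketch only constructs the SQL text and does not call any AWS API.

```python
# Hypothetical names: the bucket path, table, and columns are illustrative only.
BUCKET_PATH = "s3://example-bucket/web-logs/"


def create_table_ddl(table: str, location: str) -> str:
    """Return Hive-style DDL that maps a table onto CSV files already in S3."""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        "  request_time string,\n"
        "  client_ip string,\n"
        "  url string,\n"
        "  status int\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}'"
    )


def top_urls_query(table: str, limit: int = 10) -> str:
    """Return a query that ranks URLs by hit count using a window function."""
    return (
        "SELECT url, hits, rank() OVER (ORDER BY hits DESC) AS hit_rank\n"
        f"FROM (SELECT url, count(*) AS hits FROM {table} GROUP BY url) AS t\n"
        "ORDER BY hit_rank\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Print both statements so they can be pasted into a query editor.
    print(create_table_ddl("web_logs", BUCKET_PATH))
    print()
    print(top_urls_query("web_logs"))
```

The table definition only describes where the files live and how to parse them; the data stays in Amazon S3 and is read in place when the query runs.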
“Over the past few years, AWS has built a comprehensive set of big data services that customers use to do everything from real-time analytics on streaming data, to petabyte-scale data warehousing, or Spark and Hadoop jobs – and it’s all fast, scalable, and cost-effective,” said Raju Gulabani, Vice President, Databases, Analytics, and AI, AWS. “For hundreds of thousands of customers, Amazon S3 is their primary data store – holding billions to trillions of objects. Customers have frequently asked us whether we could make it easy for anyone to run queries on their data in Amazon S3 without having to worry about provisioning or managing servers and clusters. Now they can. There is absolutely zero admin with Amazon Athena – anyone who can write a SQL query can analyze their data in Amazon S3. Amazon QuickSight and Amazon Athena are tightly integrated, enabling customers to visualize their Amazon Athena query results without even writing a SQL query.”
“We are long time customers of AWS, and use services like Amazon Redshift and Amazon EMR to support and power analytics across the company,” said Paul Cheesbrough, Chief Technology Officer, News Corp. “We received early access to Amazon Athena, and it has proven to be fast, easy to use, and cost effective. We’ve had great feedback from our teams of engineers and analysts, especially on Amazon Athena’s ability to query directly from Amazon S3, and we’re excited about where we go next with the service.”
LiveIntent, a platform for people-based marketing and advertising focused on the email channel, helps over 1,100 brands deliver marketing and advertising to 145 million people in emails sent by 1,300 top publishers every month. “The LiveIntent platform collects and processes hundreds of millions of events per day. We are continuously challenging ourselves to build and extend the platform to provide faster and cheaper access to data, which in turn translates to better and faster insights for our customers,” said Eric Raab, Executive Vice President of Engineering, LiveIntent. “We found Amazon Athena to be faster and cheaper than any other solution we evaluated and decided to utilize its capabilities right away. We really like that Amazon Athena has zero administration, and that we can query a multitude of formats directly from Amazon S3 with no loading required.”
DataXu helps marketers understand how marketing investments can lead to profitable customer relationships using data. “We process 3M+ bid requests per second, which results in a total of 3PB of incoming data every day. Even with compression and reduction, this results in 180+ Terabytes of logs per day,” said Yekesa Kosuru, Vice President, Engineering, DataXu. “We started using Amazon Athena as soon as we heard about it and are loving its simplicity, speed, and pay-per-query pricing model. Amazon Athena provides us with the ability to query our entire data set stored on Amazon S3, without the need to manage infrastructure. Because there’s nothing to manage and we only pay per query, we’re actively deploying Amazon Athena throughout the company.”
Gunosy is a leading Japanese provider of news curation apps. “We began using Amazon Athena as soon as we could and were impressed that even in preview Amazon Athena was faster than the system we had been using – even though it’s querying data directly from Amazon S3,” said Yosuke Abe, Data Scientist, Gunosy. “We’re actively migrating workloads to AWS so we can put Amazon Athena at the core of our analytics platform.”
INRIX is a leading provider of real-time traffic intelligence for enterprises, public sector, and media. “At INRIX we ingest terabytes of road network and movement data on a daily basis and run hundreds of Amazon EMR data pipelines to process it. We use Amazon S3 as a repository for our un-processed, in-process, and processed datasets. Our data scientists need to slice, dice, and analyze this data to build mathematical models of predictive analytics on road networks. Our data engineers need the ability to drill down from processed data to in-process data for monitoring and debugging data quality issues,” said Harsh Shah, Group Engineering Manager, INRIX. “We jumped at the opportunity to try Amazon Athena and loved the speed, ease of use, and flexibility offered by Amazon Athena. With Amazon Athena, any of our developers can query all of our data stored on Amazon S3 using SQL, without worrying about infrastructure or knowledge of big data processing systems. Amazon Athena has enabled us to quickly turn Amazon S3 into our data lake.”
Japan Taxi, a transportation app, has two million active users every month. “The ability to put data into Amazon S3 and query it just using standard SQL with Amazon Athena is incredible,” said Kazuhiri Iwata, Chief Technology Officer, Japan Taxi. “With Amazon Athena, we don’t have to load the data since the service can query the data in place. Now, any of our developers can query data at its most granular resolution, at low costs – enabling us to give everyone who needs it easy access to our data. Because Amazon Athena uses open source formats, we can also use other solutions like Amazon EMR on the same data, making interoperability easy. And, because Amazon Athena requires no administration, we were able to get started immediately.”
mParticle allows mobile app developers to collect and make sense of their data. “At mParticle we collect and process large amounts of data. We want all of our customers to be able to process raw data with simple languages such as SQL,” said Michael Katz, Chief Technology Officer, mParticle. “We jumped on Amazon Athena as soon as we heard about it, as the ability to quickly analyze large amounts of data using standard SQL appealed to us. With Amazon Athena, we got started immediately, paid by the query, and queries ran quickly. We liked the ANSI-SQL compatibility and that it can query both text and columnar formats.”
Nasdaq’s technology powers more than 70 marketplaces in 50 countries, and 1 in 10 of the world’s securities transactions. “Built on a vision of innovation and a heritage of disruption, we are always looking for new ways to improve efficiencies and gain new insights across business areas within all of our markets. Given that data is critical to the success of our business, we are always interested in new tools to analyze the data we have stored in Amazon Redshift, Amazon S3, and other sources,” said Nate Sammons, Principal Architect, Nasdaq, Inc. “We wanted to extend our Amazon Redshift data warehouse and build a secure, cost effective long term data store. We chose Amazon S3 for storage and Presto as part of the query and analytics system because of its ANSI-SQL compatibility and fast performance. We expect Amazon Athena will help us take that idea even further by eliminating the need for clusters and allowing all of our analysts to query data in Amazon S3 at fast speeds.”
JW Player, one of the world’s most popular video players and a leading digital and mobile video solutions company, is live on more than 2 million sites across all devices (OTT, phones, tablets, and desktops), with more than 1.3 billion unique monthly views. “We use a combination of platforms to power the JW Analytics Dashboard, which provides analytics to measure content performance across large data sets. We regularly ingest 4+ billion events per day and are always looking for solutions that simplify processing large data sets, while reducing cost and complexity,” said Rick Okin, Vice President of Engineering, JW Player. “Amazon Athena provides us with an easy to use, fast and cost-effective solution with zero-administration. We love the fact that we can just put our data in S3, use open formats such as Apache Parquet to allow interoperability with the rest of our stack, and run SQL queries, without worrying about clusters or data warehouses.”
Tableau helps people see and understand data. “Our mission is to put data in the hands of as many people as possible so they can act on it and have an impact on the world around them,” said Andrew Beers, Chief Development Officer, Tableau. “We’ve partnered with AWS for a long time and have native integrations with Amazon Redshift, Amazon EMR, and Amazon RDS. We’re excited to announce support for Amazon Athena as well. Using Tableau and Amazon Athena together, customers can visualize all their data in Amazon S3 interactively, cost-effectively, and with no infrastructure to manage.”
Customers can start using Amazon Athena from the AWS Management Console. Amazon Athena is currently available in the US East (N. Virginia) and US West (Oregon) Regions, and will expand to additional Regions in the coming months.