The agenda for Spark Summit East 2015 (18-19 March) in New York City has just been published. I’ve listed the talks below by stream and highlighted the ones that piqued my interest.
Developer stream
- Beyond SQL: Spark SQL Abstractions For The Common Spark Job
- Streaming Big Data Analytics with Team Apache: Spark & Spark Streaming, Kafka and Cassandra
- Spark User Concurrency and Context/RDD Sharing at Production Scale
- Power Hive with Spark
- Spark Application Carousel: Highlights of Several Applications Built with Spark
- GraphX: Graph Analytics in Spark
- Experience and Lessons Learned for Large-Scale Graph Analysis using GraphX
- Towards Modularizing Spark Machine Learning Jobs
- Spark Streaming—The State of the Union and the Road Beyond
- Using Spark and Elasticsearch for real-time data analysis
- Accumulo and Spark: Geospatial processing with more distribution, less shuffle
Applications stream
- Spark Plugs Into Your Car
- When Spark meets Baidu
- Plot all the data—Interactive visualization of massive datasets
- Real-Time Recommendations using Spark
- Estimating Financial Risk with Spark
- Spark’ing an Anti Money Laundering Revolution
- Recommendations in a Flash: How Gilt Uses Spark to Improve Its Customer Experience
- Graph-Based Genomic Integration using Spark
- Geospatial and Temporal Analysis and Visualization
- SILK: A Spark Based Data Pipeline to Construct a Reliable and Accurate Food Dataset
- Finding Shoe Stores in more than 100k Merchants: Using Apache Spark to group all things!
Data science stream
- Spark Infrastructure for Lumiata’s Probabilistic Graphical Model of Medical Science
- Un-collaborative filtering: Giving the right recommendations when your users aren’t helping you
- Distributed Graph-Based Entity Resolution Using Spark
- Functionality and Performance Improvement of SparkR and Its Application
- Practical Machine Learning Pipelines with MLlib
- Streaming machine learning in Spark
- HeteroSpark: A Heterogeneous CPU/GPU Spark Platform for Deep Learning Algorithms
- Multi-modal big data analysis within the Spark ecosystem
- Visualizing big data in the browser using Spark
- Interactive Scientific Image Analysis and Analytics using Spark
- Next-Generation Genomics Analysis Using Spark and ADAM