Making big data simple with Databricks

About Databricks
Enterprises have been collecting ever-larger amounts of data with the goal of extracting insights and creating value, but they are discovering many challenges on the way to operationalizing their data pipelines. These challenges include cluster management; deploying, upgrading, and configuring Spark; interactively exploring data to gain insights; and ultimately building data products.
Databricks’ vision is to dramatically simplify big data processing. It was founded by the team that created and continues to drive Apache Spark, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed.
The ideal data platform
Databricks offers a cloud platform powered by Spark that makes it easy to turn data into value, from ingest to production, without the hassle of managing complex infrastructure, systems, and tools. It is a complete solution for data scientists and engineers.
[Platform overview diagram: Notebooks, Jobs, Dashboards, and Third-Party Apps running on the Databricks Cluster Manager]
• Managed Spark Clusters in the Cloud
• Notebook Environment
• Production Pipeline Scheduler
• Third-Party Applications
How customers use Databricks
Prepare data
• Import data using APIs or connectors
• Cleanse malformed data
• Aggregate data to create a data warehouse

Perform analytics
• Explore large data sets in real time
• Find hidden patterns with regression analysis
• Implement advanced analytics algorithms

Build data products
• Rapid prototyping
• Publish customized dashboards
• Create and monitor robust production pipelines
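
As a rough sketch of this workflow, the PySpark example below (all data, column names, and numbers are invented for illustration) cleanses malformed rows, aggregates a small summary, and fits a simple regression with Spark MLlib.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression

spark = SparkSession.builder.appName("prepare-and-analyze").getOrCreate()

# Illustrative raw records; in practice these would arrive through an API,
# a connector, or files landed in cloud storage.
raw = spark.createDataFrame(
    [("2015-01-05", "10.5", "200"),
     ("2015-01-06", None,   "180"),           # malformed: missing ad spend
     ("2015-01-07", "12.0", "not_a_number"),  # malformed: unparseable revenue
     ("2015-01-08", "13.2", "260"),
     ("2015-01-09", "15.0", "300"),
     ("2015-01-10", "16.4", "330")],
    ["day", "ad_spend", "revenue"],
)

# Cleanse: cast to numeric types and drop rows that fail to parse.
clean = (raw
         .withColumn("ad_spend", F.col("ad_spend").cast("double"))
         .withColumn("revenue", F.col("revenue").cast("double"))
         .dropna())

# Aggregate: a small summary of the kind you might load into a warehouse.
clean.agg(F.sum("revenue").alias("total_revenue"),
          F.avg("ad_spend").alias("avg_ad_spend")).show()

# Analyze: a toy regression of revenue on ad spend with Spark MLlib.
assembler = VectorAssembler(inputCols=["ad_spend"], outputCol="features")
train = assembler.transform(clean).select("features", F.col("revenue").alias("label"))
model = LinearRegression().fit(train)
print("revenue ~ %.2f * ad_spend + %.2f" % (model.coefficients[0], model.intercept))
```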
How Databricks can help your business
Focus on finding answers. And capture the full value of your data.
• Higher productivity
• Faster deployment of data pipelines
• Data democratization

Effortlessly manage large-scale Spark clusters
Spin up and scale out clusters to hundreds of nodes and beyond with just a few clicks, without IT or DevOps. Easily harness the power of Spark for streaming, machine learning, graph processing, and more.
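
As a rough illustration of the kind of workload such a cluster runs, here is a minimal PySpark sketch using Spark's Structured Streaming API; it relies on the built-in rate source so it stays self-contained, and the window size and console sink are placeholder choices rather than a prescribed setup.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks an active session is provided by the attached cluster;
# getOrCreate() reuses it (or starts a local one elsewhere).
spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

# The built-in "rate" source generates timestamped rows for the demo;
# in practice you would read from Kafka, Kinesis, or files in cloud storage.
stream = (spark.readStream
          .format("rate")
          .option("rowsPerSecond", 100)
          .load())

# Count events in 10-second windows; Spark distributes the work across nodes.
windowed = (stream
            .groupBy(F.window("timestamp", "10 seconds"))
            .count())

query = (windowed.writeStream
         .outputMode("complete")
         .format("console")
         .start())

query.awaitTermination(30)  # let the demo run briefly, then stop it
query.stop()
```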

Accelerate your work with an interactive workspace
Work interactively while automatically documenting your progress in notebooks in R, Python, Scala, or SQL. Visualize data in just a few clicks, and use familiar tools like matplotlib, ggplot, or d3.
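
As an illustration, a notebook cell along these lines (the data and column names are invented) aggregates a small Spark DataFrame and charts it with matplotlib:

```python
import matplotlib.pyplot as plt
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("notebook-viz").getOrCreate()

# Illustrative data: monthly signups (replace with your own table or files).
signups = spark.createDataFrame(
    [("2015-01", 120), ("2015-02", 180), ("2015-03", 260), ("2015-04", 310)],
    ["month", "signups"],
)

# Collect the small aggregate to the driver and plot it with matplotlib.
pdf = signups.toPandas()
plt.bar(pdf["month"], pdf["signups"])
plt.title("Monthly signups")
plt.xlabel("month")
plt.ylabel("signups")
plt.show()  # in a Databricks notebook you can also pass the figure to display()
```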

Run your production jobs at scale
Put new applications in production with one click by scheduling either notebooks or JARs. Monitor the progress of production jobs and set up automated alerts to notify you of changes.
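
One way such a scheduled job can also be defined programmatically is through the Databricks Jobs REST API; the sketch below assumes the /api/2.0/jobs/create endpoint is available in your workspace, and every URL, token, path, and cluster setting is a placeholder.

```python
import requests

# Placeholders: substitute your own workspace URL and personal access token.
HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

# A minimal job definition: run a notebook nightly at 2:00 on a new cluster.
job = {
    "name": "nightly-etl",
    "new_cluster": {
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type>",
        "num_workers": 4,
    },
    "notebook_task": {"notebook_path": "/Users/me@example.com/etl"},
    "schedule": {"quartz_cron_expression": "0 0 2 * * ?", "timezone_id": "UTC"},
}

resp = requests.post(
    HOST + "/api/2.0/jobs/create",
    headers={"Authorization": "Bearer " + TOKEN},
    json=job,
)
resp.raise_for_status()
print("Created job:", resp.json().get("job_id"))
```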

Collaborate interactively
Seamlessly share notebooks, collaborate in the same code base, comment on each other’s work, and track activities.

Publish your analysis with customized dashboards
Build and articulate your findings in dashboards in a few clicks. Set up dashboards to update automatically through jobs.
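
A dashboard panel typically starts life as a notebook cell; the sketch below assumes a Databricks notebook, where spark and display() are provided by the environment, and the data is invented for illustration.

```python
# Run inside a Databricks notebook cell: `spark` and `display()` come with it.
from pyspark.sql import functions as F

# Illustrative data: replace with your own table, e.g. spark.table("page_views").
views = spark.createDataFrame(
    [("2015-03-01", "home", 1200), ("2015-03-01", "pricing", 340),
     ("2015-03-02", "home", 1350), ("2015-03-02", "pricing", 410)],
    ["day", "page", "views"],
)

daily = (views.groupBy("day")
         .agg(F.sum("views").alias("total_views"))
         .orderBy("day"))

# display() renders an interactive table or chart; the chart can then be added
# to a dashboard, and a scheduled job keeps it refreshed automatically.
display(daily)
```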

Connect your favorite apps
Run your favorite BI tools or sophisticated third-party applications on Databricks Cloud.
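
databricks.com | info@databricks.com | 1-866-330-0121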