“DevOps for Big Data” by Supaket (คุณศุภเกศ วงศ์คำภู), Solution Architect at Enersys.co.th
Big Data & Analytics seminar by Data Cube (ดาต้า คิวบ์, facebook.com/datacube.th)
DevOps for Big Data by @Supaket 4 April 2015

DevOps for Big Data
Software Every Thing @ Enersys
• FICO (Thailand) (Past)
• DST (Thailand) (Past)
• Thomson Reuters (Thailand) (Past)
• Meta Genesis Development (Past)
@Supaket
http://facebook.com/supaket
https://www.linkedin.com/in/supaket

Software Engineering Practice: Time to Market
Dev / Ops
• Build faster
• Test in a production-like environment
• Reduce time to test
• Virtualize dev & test
• Deploy faster
• Deploy often
• Increase coverage
Source: http://newrelic.com/devops/lifecycle

What is DevOps? - In Simple English
http://www.youtube.com/watch?v=_I94-tJlovg

DevOps
DevOps (a portmanteau of "development" and "operations") is a concept dealing with, among other things: software development, operations, and services. It emphasises communication, collaboration, and integration between software developers and information technology (IT) operations personnel.
Source: en.wikipedia.org/wiki/DevOps

DevOps: Culture, Process, Tools
A mindset of adopting culture, process, and tools to raise software quality and speed up development, testing, and release, and so shorten time to market. (supaket)

2014 State of DevOps report
Strong IT performance is a competitive advantage. Firms with high-performing IT organisations were twice as likely to exceed their profitability, market share and productivity goals.

2014 State of DevOps report
DevOps practices improve IT performance. IT performance strongly correlates with well-known DevOps practices such as use of version control and continuous delivery. The longer an organization has implemented, and continues to improve upon, DevOps practices, the better it performs.
And better IT performance correlates to higher performance for the entire organization.

2014 State of DevOps report
Organizational culture matters. Organizational culture is one of the strongest predictors of both IT performance and overall performance of the organization. High-trust organizations encourage good information flow, cross-functional collaboration, shared responsibilities, learning from failures and new ideas; they are also the most likely to perform at a high level. These cultural practices and norms found in high-trust organizations are also at the heart of DevOps, which helps explain why DevOps practices correlate so strongly with high organizational performance.

2014 State of DevOps report
Job satisfaction is the No. 1 predictor of organisational performance. We all know how job satisfaction feels: It’s about doing work that’s challenging and meaningful, and being empowered to exercise our skills and judgment. We also know that where there’s job satisfaction, employees bring the best of themselves to work: their engagement, their creativity and their strongest thinking. That makes for more innovation in any area of the business, including IT.

Production vs Development environment
What is the problem?

Common Problems (source: http://newrelic.com/devops/lifecycle)
• It works on my machine
• Reproducibility
• 20 people join the team: how do they start developing on day one?
• Production-like environment (source: http://www.blue-agility.com/important-lesson-getting-code-production/)

Introduction to Virtualization
• Production environment
• Production-like environment
• Developer machine
Source: http://newrelic.com/devops/lifecycle

What about virtualization?
• Hypervisor
• Container

What are Vagrant & Docker?

What is Vagrant?
Vagrant is a tool for building complete development environments. With an easy-to-use workflow and focus on automation, Vagrant lowers development environment setup time, increases development/production parity, and makes the "works on my machine" excuse a relic of the past. (vagrantup.com)
• A VM management tool
• Automates the setup of your environment (Dev & QA)

Vagrant commands
init, up, halt, reload, suspend, resume, destroy, package
Source: http://newrelic.com/devops/lifecycle

Vagrant - Big Picture (diagram)

Vagrant - Network Mode (diagram)

Vagrant for Developer Machine: New Joiner
• Someone joins your project…
• They pick up their laptop…
• Then they spend the next 1-2 days following instructions on setting up their environment, tools, etc.

What is Docker?
Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. Consisting of Docker Engine, a portable, lightweight runtime and packaging tool, and Docker Hub, a cloud service for sharing applications and automating workflows, Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments.
As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud. Solomon Hykes, Docker’s Founder & CTO, gives an overview of Docker in this short video (7:16).

What is Docker? (diagram)

Docker for shipping an immune environment

Apache Spark
An introduction to Spark and Spark Streaming

What is Apache Spark?
• Cluster computing engine designed to be fast and general-purpose
• Good for processing streaming data
• Good for machine learning tasks
• Unified platform

Spark Components (diagram)

Spark Core
• Basic functionality of Spark, including components for task scheduling, memory management, fault recovery, interacting with storage systems, and more
• Provides the API for resilient distributed datasets (RDDs)

Concept: Resilient distributed datasets (RDDs)
• Immutable collections of objects spread across a cluster
• Built through parallel transformations (map, filter, etc.)
• Controllable persistence (e.g. caching in RAM)
• Automatically rebuilt on failure
• Contain any type of Python, Java, or Scala objects, including user-defined classes
Key idea: Write programs in terms of transformations on distributed datasets

Spark Streaming (1)
• Spark component that enables processing of live streams of data, e.g. production log files and queues
• Provides an API for manipulating data streams (DStream)
• Same fault tolerance, throughput, and scalability as Spark Core
• Spark’s built-in machine learning algorithms and graph processing algorithms can be applied to data streams

Spark Streaming (2)
• Chop up the live stream into batches of X seconds
• Spark treats each batch of data as RDDs and processes them using RDD operations
• Finally, the processed results of the RDD operations are returned in batches

Log anomaly detection in production (diagram)
Components: Apache Spark, Input Reader, Apache Log Reader, JsonMesage, DStream, PredictionModel, RDD, FileOutPut, YARN, Result Output, Vagrant (production environment)

Log anomaly detection in development (diagram)
Components: Apache Spark, Input Reader, Apache Log Reader, JsonMesage, DStream, PredictionModel, RDD, FileOutPut, YARN, Docker, Result Output, Vagrant (developer machine)

Showcase
Running demo

Q&A
Thank you

References
• http://www.devopsdays.in.th
• http://www.devopsdays.org
• http://devopscafe.org
• http://vimeo.com/devopsdays
• http://newrelic.com/devops/lifecycle
• http://www.slideshare.net/search/slideshow?searchfrom=header&q=devops
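
The micro-batch model described on the Spark Streaming (2) slide can be sketched in plain Python. This is only an illustration of the idea (chop a stream into batches of X seconds, then run an ordinary batch operation on each one); it does not use the Spark API, and the names `micro_batches` and `count_errors` as well as the sample log lines are hypothetical, not taken from the talk or its demo.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (seconds since the stream started, raw log line)
Event = Tuple[float, str]


def micro_batches(events: Iterable[Event], batch_seconds: float) -> List[List[str]]:
    """Group timestamped events into consecutive windows of batch_seconds."""
    if batch_seconds <= 0:
        raise ValueError("batch_seconds must be positive")
    windows: Dict[int, List[str]] = defaultdict(list)
    for timestamp, line in events:
        # Events whose timestamps fall in the same window share a key.
        windows[int(timestamp // batch_seconds)].append(line)
    # Return the batches in time order; windows with no events are skipped.
    return [windows[key] for key in sorted(windows)]


def count_errors(batch: List[str]) -> int:
    """Stand-in for a per-batch operation: filter on status 500, then count."""
    return sum(1 for line in batch if line.split()[-1] == "500")


# Hypothetical access-log events, assumed to end with the HTTP status code.
events = [
    (0.2, "GET /index.html 200"),
    (0.9, "GET /api/items 500"),
    (1.4, "GET /api/items 500"),
    (2.7, "GET /index.html 200"),
    (3.1, "POST /login 500"),
]

batches = micro_batches(events, batch_seconds=2.0)
results = [count_errors(batch) for batch in batches]
print(results)
```

Each entry in `results` corresponds to one batch, which mirrors the slide's point that processed results come back batch by batch rather than per event. In Spark itself the grouping and distribution are handled by the streaming engine; this sketch groups an in-memory list only.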