{"id":889,"date":"2014-06-29T23:34:00","date_gmt":"2014-06-29T15:34:00","guid":{"rendered":"http:\/\/www.brofive.org\/?p=889"},"modified":"2018-04-24T00:28:13","modified_gmt":"2018-04-23T16:28:13","slug":"google-cloud-dataflow%e6%98%af%e4%bb%80%e4%b9%88%ef%bc%9f","status":"publish","type":"post","link":"http:\/\/www.brofive.net\/?p=889","title":{"rendered":"What is Google Cloud Dataflow?"},"content":{"rendered":"<p><font size=\"4\">At the Google I\/O conference a few days ago, Google officially announced Cloud Dataflow, the successor to MapReduce. The product joins Google's cloud computing platform as a competitive offering that unifies batch processing and stream computing; it serves essentially as an ETL and streaming tool, while back-end analysis is left to BigQuery (Dremel). CDF is aimed mainly at the streaming data processing products AWS has launched, such as DataPipeline and Kinesis. It has three main uses:<\/font> <\/p>\n<p><font size=\"4\">\u00b7 for data integration and preparation (e.g.
in preparation for interactive SQL in BigQuery)<\/font> <\/p>\n<p><font size=\"4\">\u00b7 to examine a real-time stream of events for significant patterns and activities<\/font> <\/p>\n<p><font size=\"4\">\u00b7 to implement advanced, multi-step processing pipelines to extract deep insight from datasets of any size<\/font> <\/p>\n<p><font size=\"4\">Google CDF is based on Google's internal Flume and MillWheel systems, <strike>but in a sense CDF does not exclude MR<\/strike>; at least the Flume paper does not describe any replacement for it, while MillWheel handles streaming. Google CDF will probably not have much impact on the Hadoop community, which is already gradually moving to Spark. Google itself only began migrating its internal systems to CDF one to two years ago, so some MR systems are probably still running inside the company. What was announced this time is that CDF will be offered as a service to end users (including mobile messaging apps such as Snapchat and Rising Star).<\/font> <\/p>\n<p><font
size=\"4\"><strike>In my view, the claims in some domestic and international media that Google has abandoned MR, or has replaced MR with CDF, are questionable. CDF is most likely a system that combines MillWheel, Flume, and MR. Of course, MR itself may have been optimized and improved, for example with in-memory techniques; after all, even GFS has been renamed CNS, so why couldn't MR be optimized?<\/strike><\/font> <\/p>\n<p><font size=\"4\">If you are interested in reading further, continue below:<\/font> <\/p>\n<p><font size=\"4\">1. <\/font><a href=\"http:\/\/googlecloudplatform.blogspot.ae\/2014\/06\/sneak-peek-google-cloud-dataflow-a-cloud-native-data-processing-service.html\"><font size=\"4\">http:\/\/googlecloudplatform.blogspot.ae\/2014\/06\/sneak-peek-google-cloud-dataflow-a-cloud-native-data-processing-service.html<\/font><\/a> <\/p>\n<p><font size=\"4\">Sneak peek: Google Cloud Dataflow, a Cloud-native data processing service<\/font> <\/p>\n<p><font size=\"4\"><b>Posted:<\/b> Thursday, June 26, 2014<\/font> <\/p>\n<p><font size=\"4\">In today&#8217;s world, information is being generated at an incredible rate. However, unlocking insights from large datasets can be cumbersome and costly, even for experts.<br \/>It doesn\u2019t have to be that way. Yesterday, at <\/font><a href=\"https:\/\/www.google.com\/io\"><font size=\"4\">Google I\/O<\/font><\/a><font size=\"4\">, you got a sneak peek of Google Cloud Dataflow, the latest step in our effort to make data and analytics accessible to everyone. You can use Cloud Dataflow:<\/font> <\/p>\n<p><font size=\"4\">\u00b7 for data integration and preparation (e.g.
in preparation for interactive SQL in BigQuery)<\/font> <\/p>\n<p><font size=\"4\">\u00b7 to examine a real-time stream of events for significant patterns and activities<\/font> <\/p>\n<p><font size=\"4\">\u00b7 to implement advanced, multi-step processing pipelines to extract deep insight from datasets of any size<\/font> <\/p>\n<p><font size=\"4\">In these cases and many others, you use Cloud Dataflow\u2019s data-centric model to easily express your data processing pipeline, monitor its execution, and get actionable insights from your data, free from the burden of deploying clusters, tuning configuration parameters, and optimizing resource usage. Just focus on your application, and leave the manag<\/font> <\/p>\n<p><font size=\"4\">2. <\/font><a href=\"http:\/\/googlecloudplatform.blogspot.com\/2014\/06\/reimagining-developer-productivity-and-data-analytics-in-the-cloud-news-from-google-io.html\"><font size=\"4\">http:\/\/googlecloudplatform.blogspot.com\/2014\/06\/reimagining-developer-productivity-and-data-analytics-in-the-cloud-news-from-google-io.html<\/font><\/a> <\/p>\n<h6><font size=\"4\">Reimagining developer productivity and data analytics in the cloud &#8211; news from Google IO<\/font><\/h6>\n<p><font size=\"4\"><b>Posted:<\/b> Wednesday, June 25, 2014<\/font> <\/p>\n<p><font size=\"4\">Today at <\/font><a href=\"https:\/\/www.google.com\/events\/io\"><font size=\"4\">Google I\/O<\/font><\/a><font size=\"4\">, we are introducing new services that help developers build and optimize data pipelines, create mobile applications, and debug, trace, and monitor their cloud applications in production.<\/font><\/p>\n<p><font size=\"4\"> <\/p>\n<p><b>Introducing Google Cloud Dataflow<\/b><br \/>A decade ago, Google invented MapReduce to process massive datasets using distributed computing.
Since then, more devices and information require more capable analytics pipelines \u2014 though they are difficult to create and maintain.<br \/>Today at Google I\/O, we are demonstrating Google Cloud Dataflow for the first time. Cloud Dataflow is a fully managed service for creating data pipelines that ingest, transform and analyze data in both batch and streaming modes. Cloud Dataflow is a successor to MapReduce, and is based on our internal technologies like <\/font><a href=\"http:\/\/dl.acm.org\/citation.cfm?id=1806638\"><font size=\"4\">Flume<\/font><\/a><font size=\"4\"> and <\/font><a href=\"http:\/\/research.google.com\/pubs\/pub41378.html\"><font size=\"4\">MillWheel<\/font><\/a><font size=\"4\">.<br \/>Cloud Dataflow makes it easy for you to get actionable insights from your data while lowering operational costs without the hassles of deploying, maintaining or scaling infrastructure. You can use Cloud Dataflow for use cases like ETL, batch data processing and streaming analytics, and it will automatically optimize, deploy and manage the code and resources required.<\/font><\/p>\n<p><font size=\"4\"> <\/p>\n<p><b>Debug, trace and monitor your application in production<\/b><br \/>We are also introducing several new Cloud Platform tools that let developers understand, diagnose and improve systems in production.<br \/>Google Cloud Monitoring is designed to help you find and fix unusual behavior across your application stack. Based on technology from our recent acquisition of Stackdriver, Cloud Monitoring provides rich metrics, dashboards and alerting for Cloud Platform, as well as more than a dozen popular open source apps, including Apache, Nginx, MongoDB, MySQL, Tomcat, IIS, Redis, Elasticsearch and more.
For example, you can use Cloud Monitoring to identify and troubleshoot cases where users are experiencing increased error rates connecting from an App Engine module or slow query times from a Cassandra database with minimal configuration.<br \/>We know that it can be difficult to isolate the root cause of performance bottlenecks. Cloud Trace helps you visualize and understand time spent by your application for request processing. In addition, you can compare performance between various releases of your application using latency distributions.<br \/>Finally, we\u2019re introducing Cloud Debugger, a new tool to help you debug your applications in production with effectively no performance overhead. Cloud Debugger gives you a full stack trace and snapshots of all local variables for any watchpoint that you set in your code while your application continues to run undisturbed in production. This brings modern debugging to cloud-based applications.<\/p>\n<p><b>New features for mobile development<\/b><br \/>With rapid autoscaling, caching and other mobile friendly capabilities, many apps like <\/font><a href=\"https:\/\/play.google.com\/store\/apps\/details?id=com.snapchat.android&amp;hl=en\"><font size=\"4\">Snapchat<\/font><\/a><font size=\"4\"> or <\/font><a href=\"https:\/\/play.google.com\/store\/apps\/details?id=com.risingstar.android\"><font size=\"4\">Rising Star<\/font><\/a><font size=\"4\"> have built and run on Cloud Platform. We\u2019re adding new features that make building a mobile app using Cloud Platform even better.<br \/>Today, we\u2019re demonstrating a new version of Google Cloud Save, which gives you a simple API for saving, retrieving, and synchronizing user data to the cloud and across devices without needing to code up the backend.
Data is stored in <\/font><a href=\"https:\/\/cloud.google.com\/products\/cloud-datastore\/\"><font size=\"4\">Google Cloud Datastore<\/font><\/a><font size=\"4\">, making the data accessible from <\/font><a href=\"https:\/\/cloud.google.com\/products\/app-engine\/\"><font size=\"4\">Google App Engine<\/font><\/a><font size=\"4\"> or <\/font><a href=\"https:\/\/cloud.google.com\/products\/compute-engine\/\"><font size=\"4\">Google Compute Engine<\/font><\/a><font size=\"4\"> using the existing <\/font><a href=\"https:\/\/developers.google.com\/datastore\/docs\/apis\/overview\"><font size=\"4\">Datastore API<\/font><\/a><font size=\"4\">. Google Cloud Save is currently in private beta and will be available for general use soon.<br \/>We\u2019ve also added tooling to <\/font><a href=\"https:\/\/developer.android.com\/sdk\/installing\/studio.html\"><font size=\"4\">Android Studio<\/font><\/a><font size=\"4\">, which simplifies the process of adding an App Engine backend to your mobile app. In particular, Android Studio now has three built-in App Engine backend module templates, including <\/font><a href=\"https:\/\/github.com\/GoogleCloudPlatform\/gradle-appengine-templates\/tree\/master\/HelloWorld\"><font size=\"4\">Java Servlet<\/font><\/a><font size=\"4\">, <\/font><a href=\"https:\/\/github.com\/GoogleCloudPlatform\/gradle-appengine-templates\/tree\/master\/HelloEndpoints\"><font size=\"4\">Java Endpoints<\/font><\/a><font size=\"4\"> and an <\/font><a href=\"https:\/\/github.com\/GoogleCloudPlatform\/gradle-appengine-templates\/tree\/master\/GcmEndpoints\"><font size=\"4\">App Engine backend with Google Cloud Messaging<\/font><\/a><font size=\"4\">.
Since this functionality is powered by the open-source <\/font><a href=\"https:\/\/github.com\/GoogleCloudPlatform\/gradle-appengine-plugin\"><font size=\"4\">App Engine plug-in for Gradle,<\/font><\/a><font size=\"4\"> you can use the same build configuration for both your app and your backend across IDE, CLI and Continuous Integration environments.<br \/>We\u2019ll be doing more detailed follow-up posts about these announcements in the coming days, so stay tuned.<br \/>-Posted by Greg DeMichillie, Director of Product Management<br \/><i>*Apache, Nginx, MongoDB, MySQL, Tomcat, IIS, Redis, Elasticsearch and Cassandra are trademarks of their respective owners.<\/i><\/font><\/p>\n<h5><font size=\"4\">Google Launches Cloud Dataflow, A Managed Data Processing Service<\/font><\/h5>\n<p><i><font size=\"4\">Posted Jun 25, 2014 by <\/font><a href=\"http:\/\/techcrunch.com\/author\/frederic-lardinois\/\"><font size=\"4\">Frederic Lardinois<\/font><\/a><font size=\"4\"> (<\/font><a href=\"https:\/\/twitter.com\/fredericl\"><font size=\"4\">@fredericl<\/font><\/a><font size=\"4\">)<\/font><\/i> <\/p>\n<p><a href=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/1-8.jpg\"><font size=\"4\"><img loading=\"lazy\" decoding=\"async\" title=\"1\" style=\"border-top: 0px; border-right: 0px; border-bottom: 0px; border-left: 0px; display: inline\" border=\"0\" alt=\"1\" src=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/1_thumb-8.jpg\" width=\"563\" height=\"379\"><\/font><\/a><font size=\"4\"> <\/font><\/p>\n<p><font size=\"4\">Google expanded its Cloud Platform today with a new managed service called Cloud Dataflow that allows developers to <strong>create data pipelines to help them ingest, transform and \u2014 most importantly \u2014 analyze data<\/strong>. Developers can use the service to work with streaming real-time data and by uploading batches of data to the system.<\/font> <\/p>\n<p><font size=\"4\">For now, the service is in private beta and it\u2019s
unclear how Google will price Dataflow once it is launched to the public. At its core, Cloud Dataflow is Google\u2019s successor to <\/font><a href=\"https:\/\/developers.google.com\/appengine\/docs\/python\/dataprocessing\/\"><font size=\"4\">MapReduce<\/font><\/a><font size=\"4\">, which has been an experimental App Engine feature for quite a while now.<\/font> <\/p>\n<p><font size=\"4\">The company says Dataflow is based on a number of technologies the company has been using internally, including <\/font><a href=\"http:\/\/dl.acm.org\/citation.cfm?id=1806638\"><font size=\"4\">Flume<\/font><\/a><font size=\"4\"> and <\/font><a href=\"http:\/\/research.google.com\/pubs\/pub41378.html\"><font size=\"4\">MillWheel<\/font><\/a><font size=\"4\">. Google is using Java for the first Cloud Dataflow SDK, but it is also providing a dashboard for monitoring these pipelines right from the developer console.<\/font> <\/p>\n<p><font size=\"4\"><\/font>&nbsp;<\/p>\n<p><a href=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/2-8.jpg\"><font size=\"4\"><img loading=\"lazy\" decoding=\"async\" title=\"2\" style=\"border-top: 0px; border-right: 0px; border-bottom: 0px; border-left: 0px; display: inline\" border=\"0\" alt=\"2\" src=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/2_thumb-8.jpg\" width=\"564\" height=\"380\"><\/font><\/a><font size=\"4\"> <\/font><\/p>\n<p><font size=\"4\">The focus here, according to Google, is to help its users get \u201cactionable insights from your data while lowering operational costs without the hassles of deploying, maintaining or scaling infrastructure.\u201d<\/font> <\/p>\n<p><font size=\"4\">Because this is a private beta, Google isn\u2019t publishing any throughput numbers just yet, but the service will be able to ingest virtually any kind of data in its streaming mode and newline-delimited text files, BigQuery tables and similar data in its batch mode.<\/font> <\/p>\n<p><a 
href=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/3-8.jpg\"><font size=\"4\"><img loading=\"lazy\" decoding=\"async\" title=\"3\" style=\"border-top: 0px; border-right: 0px; border-bottom: 0px; border-left: 0px; display: inline\" border=\"0\" alt=\"3\" src=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/3_thumb-7.jpg\" width=\"565\" height=\"378\"><\/font><\/a><font size=\"4\"> <\/font><\/p>\n<p><font size=\"4\">With this service, Google closes a major hole in its Cloud Platform lineup. For quite a while now, Amazon has offered its own <\/font><a href=\"http:\/\/aws.amazon.com\/datapipeline\/\"><font size=\"4\">data pipeline service<\/font><\/a><font size=\"4\">, and with <\/font><a href=\"http:\/\/aws.amazon.com\/kinesis\"><font size=\"4\">Kinesis<\/font><\/a><font size=\"4\">, it launched a service that specializes in real-time data processing at its developer conference last November.<\/font> <\/p>\n<p><font size=\"4\">Previously, Google\u2019s focus in this area had mostly been on MapReduce and BigQuery. Google says BigQuery is complementary to Dataflow. Developers can use Dataflow as a part of the data ingestion into BigQuery, for example, by preparing or filtering the data for BigQuery. Once the data is cleaned, it can be written to BigQuery, where it becomes immediately accessible. At the same time, though, Dataflow can be used to read from BigQuery in case you want to join data from your database with other data sources.
And to complete the cycle, you can then write all of this back to BigQuery, too, of course.<\/font> <\/p>\n<p><font size=\"4\">In a demo during today\u2019s keynote, Google showed how its engineers, with the help of Twitter, used this service to do sentiment analysis around the World Cup by looking at millions of tweets.<\/font> <\/p>\n<p><a href=\"http:\/\/www.datacenterknowledge.com\/archives\/2014\/06\/25\/google-dumps-mapreduce-favor-new-hyper-scale-analytics-system\/\"><font size=\"4\">http:\/\/www.datacenterknowledge.com\/archives\/2014\/06\/25\/google-dumps-mapreduce-favor-new-hyper-scale-analytics-system\/<\/font><\/a> <\/p>\n<p><font size=\"4\"><font color=\"#800080\"><strong>Google<\/strong> has abandoned MapReduce, the system for running data analytics jobs spread across many servers the company developed and later open sourced, in favor of a new cloud analytics system it has built called Cloud Dataflow.<\/font><\/font> <\/p>\n<p><font size=\"4\">MapReduce has been a highly popular infrastructure and programming model for doing parallelized distributed computing on server clusters. It is the basis of Apache Hadoop, the Big Data infrastructure platform that has enjoyed widespread deployment and become the core of many companies\u2019 commercial products.<\/font> <\/p>\n<p><font size=\"4\">The technology is unable to handle the amounts of data Google wants to analyze these days, however. Urs H\u00f6lzle, senior vice president of technical infrastructure at the Mountain View, California-based giant, said it got too cumbersome once the size of the data reached a few petabytes.<\/font> <\/p>\n<p><font size=\"4\">\u201cWe don\u2019t really use MapReduce anymore,\u201d H\u00f6lzle said in his keynote presentation at the Google I\/O conference in San Francisco Wednesday.
The company stopped using the system \u201cyears ago.\u201d<\/font> <\/p>\n<p><font size=\"4\">Cloud Dataflow, which Google will also offer as a service for developers using its cloud platform, does not have the scaling restrictions of MapReduce.<\/font> <\/p>\n<p><font size=\"4\">\u201cCloud Dataflow is the result of over a decade of experience in analytics,\u201d H\u00f6lzle said. \u201cIt will run faster and scale better than pretty much any other system out there.\u201d<\/font> <\/p>\n<p><font size=\"4\">It is a fully managed service that is automatically optimized, deployed, managed and scaled. It enables developers to easily create complex pipelines using unified programming for both batch and streaming services, he said.<\/font> <\/p>\n<p><font size=\"4\">All these characteristics address what Google thinks does not work in MapReduce: it is hard to ingest data rapidly, it requires a lot of different technology, batch and streaming are unrelated, and deployment and operation of MapReduce clusters is always required.<\/font> <\/p>\n<p><strong><font size=\"4\">H\u00f6lzle announced other new services on Google\u2019s cloud platform at the show:<\/font><\/strong> <\/p>\n<p><font size=\"4\">\u00a7 <strong>Cloud Save<\/strong> is an API that enables an application to save an individual user\u2019s data in the cloud or elsewhere and use it without requiring any server-side coding. 
Users of Google\u2019s Platform-as-a-Service offering App Engine and Infrastructure-as-a-Service offering Compute Engine can build apps using this feature.<\/font> <\/p>\n<p><font size=\"4\">\u00a7 <strong>Cloud Debugging<\/strong> makes it easier to sift through lines of code deployed across many servers in the cloud to identify software bugs.<\/font> <\/p>\n<p><font size=\"4\">\u00a7 <strong>Cloud Tracing<\/strong> provides latency statistics across different groups (latency of database service calls for example) and provides analysis reports.<\/font> <\/p>\n<p><font size=\"4\">\u00a7 <strong>Cloud Monitoring<\/strong> is an intelligent monitoring system that is a result of integration with Stackdriver, a <\/font><a href=\"http:\/\/www.datacenterknowledge.com\/archives\/2014\/05\/09\/google-acquires-popular-cloud-monitoring-firm-stackdriver\/\"><font size=\"4\">cloud monitoring startup Google bought in May<\/font><\/a><font size=\"4\">. The feature monitors cloud infrastructure resources, such as disks and virtual machines, as well as service levels for Google\u2019s services as well as more than a dozen non-Google open source packages.<\/font> <\/p>\n<p><a href=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/4.png\"><font size=\"4\"><img loading=\"lazy\" decoding=\"async\" title=\"4\" style=\"border-top: 0px; border-right: 0px; border-bottom: 0px; border-left: 0px; display: inline\" border=\"0\" alt=\"4\" src=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/4_thumb.png\" width=\"613\" height=\"410\"><\/font><\/a><\/p>\n<p><a href=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/5.png\"><font size=\"4\"><img loading=\"lazy\" decoding=\"async\" title=\"5\" style=\"border-top: 0px; border-right: 0px; border-bottom: 0px; border-left: 0px; display: inline\" border=\"0\" alt=\"5\" src=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/5_thumb.png\" width=\"607\" height=\"406\"><\/font><\/a><\/p>\n<p><a 
href=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/6.png\"><font size=\"4\"><img loading=\"lazy\" decoding=\"async\" title=\"6\" style=\"border-top: 0px; border-right: 0px; border-bottom: 0px; border-left: 0px; display: inline\" border=\"0\" alt=\"6\" src=\"http:\/\/www.brofive.net\/wp-content\/uploads\/2018\/04\/6_thumb.png\" width=\"609\" height=\"422\"><\/font><\/a><font size=\"4\"> <\/font><\/p>\n<p><font size=\"4\"><\/font>&nbsp;<\/p>\n<p><font size=\"4\">Related information:<\/font> <\/p>\n<p><font size=\"4\">1. <\/font><a href=\"http:\/\/www.infoworld.com\/t\/hadoop\/why-google-cloud-dataflow-no-hadoop-killer-245212\"><font size=\"4\">http:\/\/www.infoworld.com\/t\/hadoop\/why-google-cloud-dataflow-no-hadoop-killer-245212<\/font><\/a> <\/p>\n<p><font size=\"4\">2. <\/font><a href=\"http:\/\/googlecloudplatform.blogspot.ae\/2014\/06\/sneak-peek-google-cloud-dataflow-a-cloud-native-data-processing-service.html\"><font size=\"4\">Sneak peek: Google Cloud Dataflow, a Cloud-native data processing service<\/font><\/a> <\/p>\n<p><font size=\"4\">3. <\/font><a href=\"http:\/\/googlecloudplatform.blogspot.com\/2014\/06\/reimagining-developer-productivity-and-data-analytics-in-the-cloud-news-from-google-io.html\"><font size=\"4\">Reimagining developer productivity and data analytics in the cloud &#8211; news from Google IO<\/font><\/a> <\/p>\n<p><font size=\"4\">4. <\/font><a href=\"http:\/\/techcrunch.com\/2014\/06\/25\/google-launches-cloud-dataflow-a-managed-data-processing-service\/\"><font size=\"4\">Google Launches Cloud Dataflow, A Managed Data Processing Service<\/font><\/a> <\/p>\n<p><font size=\"4\">5. <\/font><a href=\"http:\/\/www.zdnet.com\/google-launches-cloud-dataflow-says-mapreduce-tired-7000030937\/\"><font size=\"4\">Google launches Cloud Dataflow, says MapReduce tired<\/font><\/a> <\/p>\n<p><font size=\"4\">6. <\/font><a
href=\"http:\/\/www.tableausoftware.com\/about\/blog\/2014\/6\/google-cloud-dataflow-31503\"><font size=\"4\">Why Google&#8217;s Unveiling of Cloud Dataflow is Great News for Tableau Users<\/font><\/a> <\/p>\n<p><font size=\"4\">7. <\/font><a href=\"http:\/\/www.datacenterknowledge.com\/archives\/2014\/06\/25\/google-dumps-mapreduce-favor-new-hyper-scale-analytics-system\/\"><font size=\"4\">Google Dumps MapReduce in Favor of New Hyper-Scale Analytics System<\/font><\/a> <\/p>\n<p><font size=\"4\">8. <\/font><a href=\"http:\/\/www.infoq.com\/cn\/news\/2014\/06\/google-cloud-dataflow\"><font size=\"4\">http:\/\/www.infoq.com\/cn\/news\/2014\/06\/google-cloud-dataflow<\/font><\/a> <\/p>\n","protected":false},"excerpt":{"rendered":"<p>At the Google I\/O conference a few days ago, Google officially announced Cloud Dataflow, the successor to MapReduce. The product will&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[22,119],"tags":[155],"views":4294,"_links":{"self":[{"href":"http:\/\/www.brofive.net\/index.php?rest_route=\/wp\/v2\/posts\/889"}],"collection":[{"href":"http:\/\/www.brofive.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.brofive.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.brofive.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/www.brofive.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=889"}],"version-history":[{"count":1,"href":"http:\/\/www.brofive.net\/index.php?rest_route=\/wp\/v2\/posts\/889\/revisions"}],"predecessor-version":[{"id":890,"href":"http:\/\/www.brofive.net\/index.php?rest_route=\/wp\/v2\/posts\/889\/revisions\/890"}],"wp:attachment":[{"href":"http:\/\/www.brofive.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=889"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.brofive.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=889"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.brofive.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=889"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}