A Big Data Platform Based on Data Lake Architecture: The Joint Bingo Cloud and Gartner Report Is Now Available

  • Data resources in the data lake can be catalogued by subject, organization, special topic, and other dimensions, which keeps the data findable.

  • Data quality in the data lake can be monitored and analyzed along several dimensions, including timeliness, completeness, consistency, and accuracy, with closed-loop management of quality monitoring, analysis, inspection, and reporting. Data consumers can also rate and comment on the quality of data resources, which continuously improves the data quality of the data lake.

  • Performance metrics can be monitored and analyzed across the whole process of data integration, storage, processing, and consumption. Each stage is monitored and analyzed in real time, so managers can grasp the overall running state of the data lake promptly, which helps guide its operation and sustainable development.
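As a rough illustration of the quality dimensions named above, the following sketch scores a batch of records on timeliness, completeness (integrity), consistency, and accuracy. It is not the Bingo platform's implementation: the record layout, the `REQUIRED_FIELDS` schema, the one-day freshness window, and the "non-negative amount" rule are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

REQUIRED_FIELDS = ("id", "amount", "updated_at")  # hypothetical schema


def quality_report(records, now, max_age=timedelta(days=1)):
    """Score a batch of dict records from 0.0 to 1.0 on four quality dimensions."""
    total = len(records)
    if total == 0:
        return {}

    def ratio(count):
        return count / total

    # Completeness: every required field is present and not None.
    complete = sum(
        all(r.get(f) is not None for f in REQUIRED_FIELDS) for r in records
    )
    # Timeliness: the record was refreshed within max_age of `now`.
    timely = sum(
        r.get("updated_at") is not None and now - r["updated_at"] <= max_age
        for r in records
    )
    # Consistency: the id is unique within the batch.
    id_counts = {}
    for r in records:
        id_counts[r.get("id")] = id_counts.get(r.get("id"), 0) + 1
    consistent = sum(id_counts[r.get("id")] == 1 for r in records)
    # Accuracy: amount is a non-negative number (a stand-in business rule).
    accurate = sum(
        isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
        for r in records
    )
    return {
        "completeness": ratio(complete),
        "timeliness": ratio(timely),
        "consistency": ratio(consistent),
        "accuracy": ratio(accurate),
    }


now = datetime(2024, 1, 2, 12, 0)
batch = [
    {"id": 1, "amount": 10.0, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"id": 2, "amount": -5.0, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"id": 2, "amount": 7.5, "updated_at": datetime(2023, 12, 1, 9, 0)},
    {"id": 3, "amount": None, "updated_at": datetime(2024, 1, 2, 9, 0)},
]
report = quality_report(batch, now)
```

Scores like these could feed the monitoring, inspection, and reporting loop described above, with consumer ratings treated as an additional signal.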


Data Analysis and Consumption

Once large volumes of data have been collected into the data lake and processed through development, the usable results are stored back into the data lake to support various big data analysis applications.

The Bingo Data Lake solution provides a big data analytics platform that lets users consume data through self-service analysis, data visualization, and other means, and freely explore the potential and value of their data. The platform has built-in analysis components including dashboards, data source management, report tables, data reports, and computation and display of data combined with geographic location. It also supports third-party data analysis tools and analysis tools developed by users.

  • A built-in self-service query tool lets users build data analyses directly in a graphical interface. By configuring query conditions such as the data model, filter conditions, and result fields, users obtain the corresponding analysis result report.

  • A variety of charts are available for presenting data analysis, such as map tools, report tables, data mind maps, and data reports. Results are presented according to established data visualization methods, which greatly improves the readability of the conclusions.

  • Collaboration and sharing are supported throughout data analysis. Different users, possibly including data administrators, analysts, and front-line business staff, can divide the work between source data and final result, so that all kinds of users can take part in the analysis and share analysis reports in a social manner.
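To make the self-service query idea concrete, here is a minimal sketch of a query described by a data model, filter conditions, and result fields. The `run_query` function, the operator names, and the `orders` model are hypothetical and only illustrate the shape of such a configuration, not the product's actual query syntax.

```python
import operator

# Supported filter operators; the names are illustrative only.
OPERATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
}


def run_query(models, query):
    """Filter the rows of one data model and keep only the requested fields."""
    rows = models[query["model"]]
    for field, op_name, value in query.get("filters", []):
        compare = OPERATORS[op_name]
        rows = [row for row in rows if compare(row[field], value)]
    return [{f: row[f] for f in query["fields"]} for row in rows]


# A hypothetical in-memory data model standing in for a table in the lake.
MODELS = {
    "orders": [
        {"order_id": 1, "region": "south", "total": 120},
        {"order_id": 2, "region": "north", "total": 80},
        {"order_id": 3, "region": "south", "total": 40},
    ]
}

query = {
    "model": "orders",
    "filters": [("region", "eq", "south"), ("total", "gt", 50)],
    "fields": ["order_id", "total"],
}
result = run_query(MODELS, query)
```

A graphical interface would build the `query` dictionary from form inputs. Keeping it as plain data is one way to let the same query be saved, shared, and rerun by collaborators.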


Application Scenarios

Based on the features and innovations of the Bingo Data Lake solution described above, three scenarios suited to a data lake solution are listed below.


Scenario 1: Data Sharing Across Organizational Boundaries

As big data has matured, enterprises and governments have built their own big data platforms, which have effectively improved enterprise production efficiency and sales models as well as government governance. Data applications are no longer confined to an organization's own data: aggregating and analyzing data shared among multiple parties enables greater data innovation, which in turn improves the governance of enterprises and government organizations.

Problems with Traditional Solutions

  • Heterogeneous technologies are hard to integrate

The data that organizations generate is complex and diverse, which makes aggregation difficult. Hadoop can handle data storage and processing for a single department, but it does not by itself resolve technology integration or sharing permissions across organizational boundaries. Organizations also follow different big data technology stacks, which makes integration harder still.

  • Data sharing modes fall short

Common modes of sharing and opening data across organizational boundaries include data query interfaces, FTP file exchange, and big data exchanges.

  • FTP file exchange suffers from weak security, poor exchange performance, hard-to-define data sovereignty, and the need to copy data.

  • Big data exchanges lack a foundation of aggregated data, so they struggle to support correlating and cross-matching large volumes of data.

  • Lack of support for an operations system

Big data platforms often emphasize technology over operations and quality, which makes them hard to sustain. A comprehensive data operations system, built on data evaluation, data quality, and a data openness index, is needed to keep data sharing sustainable.

Response and Solution

To address these problems and needs, the Bingo Data Lake solution deeply integrates cloud computing and big data technology. Building on data storage, it applies the innovative capabilities described in this article in four areas: data integration, data development, data management, and data consumption. It enables data sharing and opening between departments, across organizations, and across industries, helps organizations build a sustainable and healthy data ecosystem, and uses data association to extract further value from data and drive data innovation.


Scenario 2: Promoting Data-Based Industry-Academia-Research Cooperation

The Mismatch Between Industry Production Data and Research

Government agencies and large enterprises hold large amounts of production data but have relatively weak technical reserves and algorithm models, while universities and research institutions have the technology and algorithm models but lack data.

Building a Bridge Between Production and Research with a Data Lake

To address this, industry production data can be desensitized and stored in the data lake, then opened to research institutions and universities for exploratory research. Research results can in turn be fed back and applied by enterprises, which effectively promotes data-based industry-academia-research cooperation.
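The article does not say how desensitization is performed. As one possible sketch, direct identifiers can be dropped, stable identifiers replaced with a keyed hash so that records remain joinable, and contact details partially masked. The field names (`name`, `user_id`, `phone`), the masking pattern, and the key below are assumptions made for illustration.

```python
import hashlib
import hmac

SECRET_KEY = b"example-key-kept-by-the-data-owner"  # hypothetical; never shared with researchers


def pseudonymize(value, key):
    """Replace an identifier with a keyed hash so records stay joinable."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask_phone(phone):
    """Keep the first three and last two characters and mask the rest."""
    if len(phone) <= 5:
        return "*" * len(phone)
    return phone[:3] + "*" * (len(phone) - 5) + phone[-2:]


def desensitize(record, key):
    """Return a copy of the record that is safer to open for research."""
    cleaned = dict(record)
    cleaned.pop("name", None)  # drop direct identifiers entirely
    cleaned["user_id"] = pseudonymize(record["user_id"], key)
    cleaned["phone"] = mask_phone(record["phone"])
    return cleaned


raw = {"name": "Alice", "user_id": "u-1001", "phone": "13812345678", "usage_kwh": 42.5}
safe = desensitize(raw, SECRET_KEY)
```

The keyed hash means the same `user_id` always maps to the same pseudonym, so researchers can still link records, while only the key holder can recompute the mapping. Real deployments would also need to assess re-identification risk from the remaining fields.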


Scenario 3: Federated Data Lake

Security and Trust Issues in Cross-Organizational Data Centralization

Data lake projects often span multiple enterprises or different government departments. If a single unified data lake is used to manage all data centrally, data collection becomes difficult because of a series of issues between organizations, including mutual trust in data, data sovereignty, and data security.

Building an Open Data Ecosystem with Federated Data Lakes

In response, the Bingo Data Lake solution provides a decentralized federated data lake. The platform uses it to share data across departments and organizations, and a data open platform exposes data-related catalogs, tools, services, and models. Organizations and software vendors working on data models can all collaborate on it, helping enterprises and governments build a sustainable data ecosystem.
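One way to picture the federated approach is that each organization keeps its data in its own lake and publishes only catalog entries, while reads are routed to the owner, which enforces its own sharing policy. The `MemberLake` and `Federation` classes below are hypothetical and greatly simplified. The `Federation` object is only a thin router for the example and is not a description of how the Bingo platform achieves decentralization.

```python
class MemberLake:
    """One organization's lake: data stays local, only catalog entries are published."""

    def __init__(self, owner):
        self.owner = owner
        self._datasets = {}  # name -> rows, never copied into a central store
        self._grants = {}    # name -> set of organizations allowed to read

    def publish(self, name, rows, allowed):
        self._datasets[name] = rows
        self._grants[name] = set(allowed)

    def catalog(self):
        return [{"name": n, "owner": self.owner} for n in sorted(self._datasets)]

    def read(self, name, requester):
        # The owning organization decides who may read, preserving data sovereignty.
        if requester != self.owner and requester not in self._grants[name]:
            raise PermissionError(f"{requester} may not read {name}")
        return list(self._datasets[name])


class Federation:
    """Merges member catalogs and routes each read to the owning lake."""

    def __init__(self, members):
        self.members = {m.owner: m for m in members}

    def catalog(self):
        entries = []
        for member in self.members.values():
            entries.extend(member.catalog())
        return entries

    def read(self, owner, name, requester):
        return self.members[owner].read(name, requester)


hospital = MemberLake("hospital")
hospital.publish("admissions", [{"ward": "A", "count": 12}], allowed={"university"})
transport = MemberLake("transport")
transport.publish("ridership", [{"line": 3, "riders": 9000}], allowed=set())
federation = Federation([hospital, transport])
```

Here `federation.catalog()` lists both datasets, but a call such as `federation.read("transport", "ridership", "university")` raises `PermissionError` because the transport organization has not granted access.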


Contact Us

To learn more about the Bingo Cloud data lake or to request product documentation, contact Bingo Cloud customer service (ID: pingaoyunzzm).
