10.16 An acquisition, curation and management workflow for sustainable, terabyte-scale marine image analysis

Paper title: An acquisition, curation and management workflow for sustainable, terabyte-scale marine image analysis

Authors: Timm Schoening, Kevin Köser, Jens Greinert

DOI: 10.1038/sdata.2018.181

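Optical imaging is a common technique in marine research. Diving robots, towed cameras, drop cameras and TV-guided sampling gear all produce image data of the underwater environment. Advances such as 4K cameras, autonomous robots, high-capacity batteries and LED lighting now allow systematic optical monitoring over large spatial scales and in less time, with greater data volume and acquisition velocity. Growing fleets and emerging autonomous vehicles build up big data sets in parallel, which increases volume and velocity further. Data in these quantities call for automated processing tools that extract as much information as possible. Systematic data analysis benefits chiefly from calibrated, geo-referenced data with clear metadata descriptions, particularly for machine vision and machine learning. The valuable data acquired must therefore be archived, then curated, backed up and made publicly available as soon as possible.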

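In the paper "An acquisition, curation and management workflow for sustainable, terabyte-scale marine image analysis", published in Scientific Data, Timm Schoening and colleagues at the GEOMAR Helmholtz Centre for Ocean Research Kiel propose a complete workflow for sustainable marine image analysis. The authors give guidelines for data acquisition, curation and management, and apply them to the use case of processing a multi-terabyte (TB) deep-sea data set acquired by an autonomous underwater vehicle.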

Figure 1: Schematic of the workflow for image data, from acquisition through curation to management.

Abstract: Optical imaging is a common technique in ocean research. Diving robots, towed cameras, drop-cameras and TV-guided sampling gear: all produce image data of the underwater environment. Technological advances like 4K cameras, autonomous robots, high-capacity batteries and LED lighting now allow systematic optical monitoring at large spatial scale and shorter time but with increased data volume and velocity. Volume and velocity are further increased by growing fleets and emerging swarms of autonomous vehicles creating big data sets in parallel. This generates a need for automated data processing to harvest maximum information. Systematic data analysis benefits from calibrated, geo-referenced data with clear metadata description, particularly for machine vision and machine learning. Hence, the expensive data acquisition must be documented, data should be curated as soon as possible, backed up and made publicly available. Here, we present a workflow towards sustainable marine image analysis. We describe guidelines for data acquisition, curation and management and apply it to the use case of a multi-terabyte deep-sea data set acquired by an autonomous underwater vehicle.

About the journal: Scientific Data (https://www.nature.com/sdata/) is a peer-reviewed, open-access journal for descriptions of scientifically valuable datasets, and research that advances the sharing and reuse of scientific data. Scientific Data welcomes submissions from a broad range of research disciplines, including descriptions of big or small datasets, from major consortiums to single research groups. Scientific Data primarily publishes Data Descriptors, a new type of publication that focuses on helping others reuse data, and crediting those who share.

The 2017 journal metrics for Scientific Data are as follows:

•2-year impact factor: 5.305

•5-year impact factor: 5.862

•Immediacy index: 0.843

•Eigenfactor® score: 0.00855

•Article Influence Score: 2.597

•2-year Median: 2

