‘It’s all about the data’
For those close to the Atmospheric Radiation Measurement (ARM) user facility, an unofficial slogan might be, “It’s all about the data.” But the users who integrate ARM data in their research may say, “It’s all about the quality of the data.”
Now celebrating its 20th anniversary, the ARM Data Quality (DQ) Office serves as the gatekeeper of ARM data collected around the world. The DQ Office is based at the University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies.
According to ARM DQ Office Manager Randy Peppler, the DQ team’s principal mission is to identify data anomalies based on quality control procedures and tools. “Two decades ago, ARM had the foresight not only to understand the importance of data quality, but the need to produce data in a coordinated way across the organization,” says Peppler, who has been with the DQ Office since it was created in July 2000 on the recommendation of ARM’s infrastructure review of 1999.
“Today, ARM’s data program is strong, and ARM users can feel very confident about the data they’re using in their research,” says Peppler.
Building a Bigger Picture of Data Quality
In the office’s early days, recalls ARM DQ Office Associate Manager Ken Kehoe, each ARM site had its own methodology for ensuring data quality. “It was more than procedures,” he says. “The challenge was standardizing the way that problems were reported and resolved.”
Kehoe adds that the DQ team used early internet technologies to share information and build a bigger picture of ARM data quality.
Twenty years after its formation, the DQ team comprises five staff members and 10 to 12 undergraduate data quality analysts from the University of Oklahoma School of Meteorology. To ensure that ARM collects and distributes the best and most accurate data possible, the DQ team scrutinizes more than 350 datastreams and 5,000 variables every week. (Each datastream is made up of many measurements called variables.)
This mission-critical work identifies potential issues with instruments and, ultimately, helps provide the best possible measurements to ARM users.
“It’s hard to overemphasize how important it is that we provide high-quality data to the scientific community,” says ARM Technical Director Jim Mather. “Having that review by the staff at the Data Quality Office is vital to ARM users and ARM.”
It’s a big responsibility that Peppler and his team embrace. Maintaining data quality for an organization with the size and complexity of ARM is not simple. The DQ team files a data quality assessment weekly on every datastream generated by instruments at ARM fixed-location atmospheric observatories, mobile facilities, and field campaigns. The students, says Peppler, are part of the first line of defense.
What happens when the team discovers a data anomaly?
Potential anomalies are recorded in the ARM problem reporting systems and tracked to resolution by the DQ Office and instrument mentors. The team’s weekly data quality assessment reports are shared with instrument mentors to assist in the corrective maintenance of the instruments. Issues that affect data quality are then communicated to the science community through Data Quality Reports (DQRs). DQRs are visible while browsing for data, received with data orders, and accessible through the DQR web service. ARM users are also encouraged to report any concerns or questions they have about data quality to the Data Quality Office.
In addition to DQRs, information related to data quality is communicated to ARM data users through instrument handbooks found on each instrument’s web page and Embedded Quality Control flags found in most data files as ancillary variables.
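For readers unfamiliar with embedded quality control flags, the idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DQ Office's code: the bit assignments, the missing-value sentinel, the limits, the function name qc_flags, and the variable names temp_mean and qc_temp_mean are all hypothetical, and the sample values are made up. Each measurement gets a companion integer whose bits record which simple tests it failed, and a value of 0 means every test passed.

```python
# Hypothetical bit assignments for illustration only; a real data file
# documents its own tests alongside each QC variable.
MISSING_BIT = 0b001    # value equals the missing-value sentinel
BELOW_MIN_BIT = 0b010  # value is below the valid minimum
ABOVE_MAX_BIT = 0b100  # value is above the valid maximum

MISSING_VALUE = -9999.0  # assumed sentinel for a missing measurement


def qc_flags(values, valid_min, valid_max, missing=MISSING_VALUE):
    """Return one integer flag per measurement; 0 means every test passed."""
    flags = []
    for value in values:
        flag = 0
        if value == missing:
            flag |= MISSING_BIT
        else:
            if value < valid_min:
                flag |= BELOW_MIN_BIT
            if value > valid_max:
                flag |= ABOVE_MAX_BIT
        flags.append(flag)
    return flags


# Made-up air temperatures in degrees Celsius, not real ARM data.
temp_mean = [21.4, -9999.0, 75.2, -95.0, 19.8]

# The ancillary QC variable travels alongside the measurement it describes.
qc_temp_mean = qc_flags(temp_mean, valid_min=-40.0, valid_max=50.0)

# A data user can keep only the measurements that passed every test.
good = [v for v, f in zip(temp_mean, qc_temp_mean) if f == 0]
```

Storing the flags next to the measurements, rather than deleting suspect values, leaves the decision about what to use with the researcher.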
Flexible and Innovative
According to Peppler, it is very unusual for a contractual arrangement—in this case, ARM and the University of Oklahoma—to last so long. “It is a great testament to the professionalism and dedication of the people here in the DQ Office and at ARM,” he says.
“The DQ Office has always had the advantage of being flexible and innovative, which allows the staff to work closely with instrument mentors to catch and resolve any problems quickly,” says Adam Theisen of Argonne National Laboratory in Illinois. He worked in the DQ Office for nearly a decade until taking on the role of ARM instrument operations manager in 2018.
Examples of innovation, says Peppler, include DQ team-developed software tools that help to identify potential data anomalies. The most recent are Python-based software platforms. DQ-Explorer applies algorithms to raw data, allowing data analysts to review measurements and time plots of data in a dashboard platform—and flag potential issues. Two other DQ Office software products, DQ-Zoom and DQ-Plotbrowser, give data analysts precise views of specific time periods, trends, and more.
“The work of the DQ Office impacts nearly every part of ARM,” says Theisen. “The DQ Office has always benefited from a core group of experts who put a lot of work into standardizing and automating data quality processes, freeing time to provide the critical work on data analysis.”
# # #
ARM is a DOE Office of Science user facility operated by nine DOE national laboratories.