What is NMR spectroscopy?
Nuclear Magnetic Resonance (NMR) spectroscopy is an analytical technique used by chemists for quality control and research. It makes it possible to determine the content and purity of a sample as well as the molecular structures of unknown compounds. In other words, NMR can be used either to quantitatively analyse mixtures containing known compounds or to identify entirely unknown compounds. Furthermore, NMR is utilised to study chemical and physical properties at the molecular level, such as conformational exchange, reactivity, solubility and diffusion. To that end, hundreds of NMR techniques have been developed over the years. Owing to its versatility, accuracy and reproducibility, NMR has become an essential analytical technique for modern chemistry research.
NMR spectrometers with very strong superconducting magnets cooled by liquid helium are relatively expensive and not easy to maintain. Therefore, they are usually placed in large central laboratories owned by universities or big private companies. However, the recent boom in less expensive bench-top instruments, which use permanent magnets and offer lower resolution, has brought NMR spectroscopy into smaller and less conventional niches.
NMR spectrum as a fingerprint
An NMR spectrum can be seen as a fingerprint of the electronic structure of a molecule and its individual functional groups. Therefore, compounds can be identified by comparing their NMR spectra against spectral libraries of already known compounds. Unlike fingerprints, NMR spectra are highly predictable, and the molecular structure of an unknown compound can be deduced entirely from NMR spectroscopy data. However, solving such an NMR puzzle is not straightforward: multiple correlation NMR experiments are usually required, and the complexity increases dramatically with the size of the molecule.
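The library comparison described above can be illustrated with a minimal sketch. It assumes that each spectrum has already been reduced to a vector of intensities on a shared chemical-shift grid, and it uses cosine similarity as the score; both are assumptions made for illustration, not a description of any particular NMR software. The compound names and intensity values are made up.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equally binned spectra."""
    if len(a) != len(b):
        raise ValueError("spectra must share the same chemical-shift grid")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


def best_match(query, library):
    """Return (name, score) of the library entry most similar to the query."""
    scores = {name: cosine_similarity(query, ref) for name, ref in library.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Toy library: hypothetical compounds with invented binned intensities.
library = {
    "compound_A": [0.0, 1.0, 0.0, 0.5, 0.0],
    "compound_B": [0.8, 0.0, 0.0, 0.0, 0.6],
}
query = [0.0, 0.9, 0.1, 0.4, 0.0]

name, score = best_match(query, library)
print(name, round(score, 3))
```

Real spectra would first need referencing, phasing, baseline correction and binning before such a comparison is meaningful, which is one reason access to raw data with reliable metadata matters.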
NMR spectrum as a digital asset
An NMR spectrum in raw digital format can represent substantial value. Firstly, recording NMR spectra requires a significant amount of resources in the form of expensive instruments, consumables (cryogens and deuterated solvents) and the labour of highly trained employees who run the experiments and maintain the instruments. Furthermore, the labour that goes into obtaining the actual substance by synthesis and/or purification can also be considerable. The value can be seen from another perspective as well. Access to a verified source of NMR spectra in raw format can enable faster and more confident identification of substances, either in a pure state or in complex mixtures. That can bring considerable cost savings for the chemical and pharmaceutical industries and for chemistry research in general.
What is the problem?
In the last decade, the amount of NMR data that large NMR facilities produce has increased dramatically owing to the high efficiency of fully automated instruments. However, this large quantity of data is not stored and shared appropriately, despite its considerable intrinsic value. NMR spectroscopy lacks a global data repository that would be an equivalent of the CCDC for X-ray crystallography data. The status quo has not changed much despite the open access data requirements of research councils and initiatives such as GO FAIR and NMReDATA. We believe that the core of the issue lies at the bottom of the pyramid. Data management in academic NMR laboratories is rather poor, as the raw data are usually stored on a network drive without any metadata or search facilities. When the research is concluded, finding the data and uploading them into a research data repository such as Figshare or Zenodo becomes a troublesome chore that researchers would rather avoid.
NMR laboratories in industry usually have better data management, but the data remain in isolated silos even after the confidentiality embargo has ended, as there is no platform where NMR data could be traded as a digital asset.
How do we aim to solve it?
NOMAD tries to solve the problem from the bottom up by providing a smart and complete solution for NMR laboratory data management. It allows NMR data to be stored securely and robustly together with provenance metadata, and it facilitates sharing and auditability.
The NOMAD database automatically captures and stores data generated by NMR instruments and provides seamless tools for annotating, sharing and editing the data through an easily accessible web browser interface. Connecting NOMAD to an open access data repository would streamline the whole NMR laboratory workflow, from the conception of an experiment to the publication of the research results.
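To make the idea of storing raw data together with provenance metadata more concrete, here is a minimal sketch of what such a record could look like. The field names, the `SpectrumRecord` class and the `make_record` and `verify` helpers are hypothetical and chosen purely for illustration; they are not the actual NOMAD schema or API. The SHA-256 checksum is one common way to let anyone confirm later that the raw file has not been altered.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SpectrumRecord:
    """Hypothetical provenance record; not the actual NOMAD schema."""
    instrument: str
    operator: str
    solvent: str
    nucleus: str
    acquired_at: str  # ISO 8601 timestamp
    sha256: str       # checksum of the raw data file


def make_record(raw, *, instrument, operator, solvent, nucleus, acquired_at):
    """Build a record whose checksum ties the metadata to the raw bytes."""
    return SpectrumRecord(
        instrument=instrument,
        operator=operator,
        solvent=solvent,
        nucleus=nucleus,
        acquired_at=acquired_at,
        sha256=hashlib.sha256(raw).hexdigest(),
    )


def verify(raw, record):
    """Return True if the raw bytes still match the stored checksum."""
    return hashlib.sha256(raw).hexdigest() == record.sha256


# Placeholder bytes standing in for a raw data file; all values are invented.
raw = b"placeholder bytes standing in for a raw FID file"
record = make_record(
    raw,
    instrument="spectrometer-1",
    operator="example-user",
    solvent="CDCl3",
    nucleus="1H",
    acquired_at="2024-01-01T12:00:00Z",
)

print(json.dumps(asdict(record), indent=2))
```

A record of this kind can be serialised to JSON and indexed, which is what makes the data searchable later and easier to deposit in a repository.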
NOMAD control is an optional module that offers a centralised dashboard for the management of large automated NMR laboratories that operate 24 hours per day, 7 days per week. It could act as an additional driver for adoption of the NOMAD system.
The current version of the NOMAD database has passed the proof-of-concept stage and can be delivered as a dockerised installation on hardware provided by the client or collaborator. These NOMAD installations would serve as a local data management solution and as a low-threshold gateway for open access deposition of raw NMR data. At the same time, a portion of the hardware resources could be contributed to the formation of a peer-to-peer network of nodes that would facilitate the exchange of raw NMR data between participants of the network. In the long run, the NOMAD distributed database could lay the foundation for a global open access NMR data repository that would collate data from individual NOMAD nodes and also from a SaaS portal (a Dropbox-like service), which could considerably lower the threshold for adoption. The community of NMR spectroscopists has a history of great collegiality that spans the whole globe. Therefore, we strongly believe that such a lightweight architecture for a raw NMR data repository, supported by the NMR community, has a much higher chance of success than any centralised solution.
Since decentralised databases have been shown to be able to maintain the privacy and ownership of data, the P2P network that we outline here could also potentially serve as a marketplace for raw NMR data for private companies.