# Global mean nitrogen recovery efficiency in croplands can be enhanced by optimal nutrient, crop and soil management practices

### NUEr definition

Researchers have assigned different definitions for N use efficiency, thus requiring a clear definition when used. The two key approaches that are used to define and quantify N use efficiency include the N difference approach and the N balance approach^{4}. In the N difference approach, generally used in agronomic studies, the N use efficiency is calculated as the difference in N uptake in total biomass (grain and crop residues) in a fertilized and unfertilized plot, divided by the fertilizer N input. This term is generally denoted as fertilizer N recovery efficiency. In the N balance approach, being the most widely used approach in environmental studies, the N use efficiency is calculated as the ratio of N harvested by crops divided by the total N input (including not only the N input by fertilizer but also other sources, i.e., N fixation and N deposition)^{2}. In this study, we assessed the N use efficiency based on the N difference approach (the N recovery efficiency), since this is most relevant for agricultural practices. Also, the bulk of N use efficiency data collected in agronomic studies are based on an assessment of total aboveground plant N uptake in fertilized and unfertilized plots, while observations of N deposition and fixation, permitting calculation of total N input, are lacking. Some studies reporting only grain yield increase in response to added N fertilizer were not included. In our study the defined N use efficiency, denoted as NUEr to make the link to N fertilizer recovery, was thus calculated, according to Dobermann^{64}:

$$\mathrm{NUEr}=\frac{\mathrm{NUP}_{\mathrm{fertilized}}-\mathrm{NUP}_{\mathrm{unfertilized}}}{\mathrm{N}_{\mathrm{rate}}}\times 100,$$

(1)

where NUEr is expressed as a percentage (%), NUP_{fertilized} and NUP_{unfertilized} are the N uptake by aboveground plants (kg N ha^{-1}) in the fertilized treatment and the unfertilized control during the experiment, respectively, and N_{rate} is the rate of N fertilizer applied (kg N ha^{-1}).
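As a minimal illustration of Eq. 1, the calculation can be sketched in Python. The function name `nuer_percent` and the plot values are hypothetical and chosen only to show the arithmetic; they are not taken from the database.

```python
def nuer_percent(nup_fertilized, nup_unfertilized, n_rate):
    """N recovery efficiency (%) by the N difference approach (Eq. 1).

    All three inputs are expressed in kg N ha^-1.
    """
    if n_rate <= 0:
        raise ValueError("n_rate must be positive")
    return (nup_fertilized - nup_unfertilized) / n_rate * 100.0


# Hypothetical plot values, for illustration only.
example_nuer = nuer_percent(nup_fertilized=120.0, nup_unfertilized=60.0, n_rate=150.0)
print(f"NUEr = {example_nuer:.1f}%")
```

With these illustrative inputs, 60 kg N ha^{-1} of additional uptake from 150 kg N ha^{-1} of fertilizer corresponds to an NUEr of 40%.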

### Data collection

#### Collection of meta-analytical studies

In December 2021, we performed a literature search for meta-analytical studies reporting effect sizes for NUEr or N uptake in response to changes in nutrient, crop and soil management. Searches were performed in Web of Science (https://www.webofscience.com) with the search terms NUEr, N uptake, nutrient management, crop management, soil management and meta-analysis (Supplementary Note 1). The included meta-analytical studies met three criteria: (1) they linked at least one management practice to its impact on NUEr or N uptake; (2) they were limited to the management of the main cereal croplands (maize, wheat and rice), excluding grasslands and forests; and (3) they provided estimates based on field studies, thus excluding laboratory or incubation studies. When meta-analytical studies presented a summary of previous analyses, only the most recent study was selected. This search and selection resulted in the inclusion of 29 studies (Supplementary Fig. 1). Detailed information about these studies is given in Supplementary Data 1, including bibliographic details, crop types, management practices and response variables, with a summary in Supplementary Table 1. Supplementary Table 1 describes the management methods and controls, grouped into (1) nutrient management: enhanced efficiency fertilizer, combined fertilizer, organic fertilizer, mineral fertilizer, fertilizer placement, fertilizer rate and fertilizer timing; (2) crop management: residue retention, cover cropping and crop rotation; and (3) soil management: zero and reduced tillage. For each management practice, the control situation to which the practice is compared is given in Supplementary Table 2.

#### Collection of the primary data

Relevant nutrient, crop and soil management data and site conditions were retrieved from the 407 primary studies underlying the 29 meta-analytical studies. This resulted in 2436 paired observations for maize, wheat and rice (Supplementary Data 2). From these studies the following variables were extracted: (1) reference details including author, title and publication year; (2) latitude and longitude; (3) experiment duration; (4) site-specific soil properties and climatic conditions; (5) crop type; (6) number of replicates; (7) management practices applied (in predefined nutrient, crop and soil management classes); (8) mean NUEr in experimental and control treatments; and (9) measures of variation (standard error, 95% confidence interval or standard deviation). When the number of replicates was not reported, it was assumed to be 3. If studies did not provide standard deviations (SD) or standard errors, the SD was estimated from the mean coefficient of variation (CV) of the other studies in the database^{65} as:

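$$\mathrm{SD}_{\mathrm{NUEr},i}=\mathrm{CV}_{\mathrm{NUEr}}\times \mathrm{NUEr}_{i}\times 1.25,$$

(2)

where \(\mathrm{SD}_{\mathrm{NUEr},i}\) is the estimated SD for observation *i*, \(\mathrm{NUEr}_{i}\) is the mean NUEr of that observation, and \(\mathrm{CV}_{\mathrm{NUEr}}\) is the mean CV of the NUEr values provided.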
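The imputation in Eq. 2 can be sketched as follows. This is an illustrative reading, assuming that the CV is computed as SD divided by the mean for every observation that reports an SD and then averaged; the helper names `mean_cv` and `impute_sd` and the three records are hypothetical.

```python
from statistics import mean


def mean_cv(records):
    """Mean coefficient of variation (SD / mean) over records that report an SD.

    `records` is a list of (nuer_mean, sd) tuples; sd is None when missing.
    """
    cvs = [sd / m for m, sd in records if sd is not None and m != 0]
    if not cvs:
        raise ValueError("no record reports a standard deviation")
    return mean(cvs)


def impute_sd(nuer_mean, cv, factor=1.25):
    """Estimate a missing SD as CV x mean NUEr x 1.25 (Eq. 2)."""
    return cv * nuer_mean * factor


# Hypothetical records: two report an SD, the third does not.
records = [(40.0, 8.0), (30.0, 3.0), (50.0, None)]
cv_nuer = mean_cv(records)             # mean of 0.20 and 0.10
sd_missing = impute_sd(50.0, cv_nuer)  # 0.15 * 50 * 1.25
print(round(cv_nuer, 3), round(sd_missing, 3))
```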

In most of the primary studies, information on site conditions that might have affected the impacts of practices, i.e., climate and soil properties, was lacking. To be consistent, all those data were derived from the given longitude and latitude, using climate data from the CRU (Climatic Research Unit) database (http://www.cru.uea.ac.uk/data), i.e., mean annual temperature (MAT) and mean annual precipitation (MAP), and soil properties from Soil Grids (http://www.isric.org/explore/soilgrids), i.e., clay content, soil organic carbon (SOC) and soil pH.

An overview of the data collected is given in Supplementary Table 3. The sampling sites and locations of the 407 primary studies are shown in Fig. 1. The distribution is shown in Supplementary Fig. 2. Most of the study sites were located in Asia (77%), North America (14%) and Europe (4%), with fewer in South America (2%), Africa (2%) and Australia (1%) (Supplementary Data 2). Maize, wheat and rice accounted for 35, 30 and 35% of the total primary studies, respectively. The most frequently evaluated management practices were enhanced efficiency fertilizers (31%), combined fertilizer (15%) and fertilizer rate (15%), followed by crop residue (9%), fertilizer timing (8%), fertilizer placement (6%), zero tillage (6%), organic fertilizer (5%), cover cropping (2%), reduced tillage (2%) and crop rotation (1%). An overview of the NUEr mean, variance and range for the control and treated plots, and of the variation in site conditions, is given in Supplementary Table 4. The mean NUEr of experimental treatments (39%) was 6 percentage points higher than the mean of control treatments (33%). The majority of the studies had NUEr values between 20 and 60% for the control plots, in which no additional practices had been taken to increase NUEr. About one-fifth of the NUEr values were under 20%, and only one-tenth of the NUEr values were over 60% (Supplementary Data 2). Site conditions of the analyzed studies cover the main range of variability in global agricultural regions, with MAT ranging from -0.6 to 29 °C, MAP from 45 to 2330 mm, soil organic carbon content from 2.7 to 80 g kg^{-1}, soil pH from 4.5 to 8.5 and clay content from 8.8 to 53%.

### Data analysis

#### Meta-model integrating the published meta-analytical studies

Multiple observations or treatments were collected in meta-analytical studies, which means that data points were correlated. When the same management practices were reported by multiple meta-analyses, the overall mean change in NUEr due to the measure and the associated standard error were calculated by the following equations to establish one meta-model from the assessed meta-analytical studies^{52}.

$$\bar{x}=\frac{\sum ({x}_{i}/{\sigma }_{i}^{2})}{\sum (1/{\sigma }_{i}^{2})}$$

(3)

and

$${\sigma }_{\bar{x}}=\frac{1}{\sqrt{\sum (1/{\sigma }_{i}^{2})}},$$

(4)

where \(\bar{x}\) is the weighted mean, \({\sigma }_{\bar{x}}\) is the standard error of the weighted mean, \({x}_{i}\) is the individual mean of the reported effect size, and \({\sigma }_{i}^{2}\) is the individual variance of the reported effect size.
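Equations 3 and 4 describe an inverse-variance weighted mean. A minimal Python sketch is given below; the function name `pooled_effect` and the two input effect sizes are hypothetical, and the inputs are assumed to be supplied as standard errors (so that their squares give \({\sigma }_{i}^{2}\)).

```python
import math


def pooled_effect(means, std_errors):
    """Inverse-variance weighted mean (Eq. 3) and its standard error (Eq. 4)."""
    if len(means) != len(std_errors) or not means:
        raise ValueError("means and std_errors must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in std_errors]   # 1 / sigma_i^2
    total_weight = sum(weights)
    weighted_mean = sum(w * x for w, x in zip(weights, means)) / total_weight
    standard_error = 1.0 / math.sqrt(total_weight)
    return weighted_mean, standard_error


# Two hypothetical meta-analyses reporting the same practice.
pooled_mean, pooled_se = pooled_effect(means=[10.0, 20.0], std_errors=[1.0, 2.0])
print(round(pooled_mean, 3), round(pooled_se, 3))
```

The more precise estimate (standard error 1.0) receives four times the weight of the less precise one, so the pooled mean lies closer to 10 than to 20.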

#### Assessing mean effects of practices on NUEr derived from original field studies

To conduct a meta-regression on the original experimental data derived from the 407 primary studies, we first calculated the effect sizes and corresponding variances of the primary studies using three effect size metrics, based on the means, standard deviations and number of replicates of the recorded NUEr values^{63,66,67}.

The log-transformed ratio of means (ROM) was calculated as:

$$\ln {RR}=\ln \left(\frac{{X}_{t}}{{X}_{c}}\right),$$

(5)

where, \({X}_{t}\) and \({X}_{c}\) are the mean NUEr in the treatment and control groups, respectively.

The corresponding variance was calculated as:

$${V}_{\ln {RR}}=\frac{{s}_{t}^{2}}{{n}_{t}{X}_{t}^{2}}+\frac{{s}_{c}^{2}}{{n}_{c}{X}_{c}^{2}}$$

(6)

where \({n}_{t}\) and \({n}_{c}\) are the numbers of replicates in the treatment and control groups, respectively, and \({s}_{t}\) and \({s}_{c}\) are the standard deviations of the treatment and control groups, respectively. The relative change in NUEr (as %) compared to the control due to a management measure was subsequently calculated as:

$$\mathrm{Relative}\,\mathrm{change}\,\left(\%\right)=\left({e}^{\ln {RR}}-1\right)\times 100$$

(7)
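Equations 5 to 7 can be illustrated with a short Python sketch. The function names are hypothetical. The means of 39% and 33% are the overall treatment and control means reported above, whereas the standard deviations and replicate numbers are assumed values used only to make the example run.

```python
import math


def log_response_ratio(x_t, x_c, s_t, s_c, n_t, n_c):
    """Log-transformed ratio of means (Eq. 5) and its variance (Eq. 6)."""
    if x_t <= 0 or x_c <= 0:
        raise ValueError("means must be positive to take the log ratio")
    ln_rr = math.log(x_t / x_c)
    variance = s_t ** 2 / (n_t * x_t ** 2) + s_c ** 2 / (n_c * x_c ** 2)
    return ln_rr, variance


def relative_change_percent(ln_rr):
    """Back-transform ln RR to a relative change in % (Eq. 7)."""
    return (math.exp(ln_rr) - 1.0) * 100.0


# SDs (6 and 5) and replicate numbers (3 and 3) are assumed for illustration.
ln_rr, v_ln_rr = log_response_ratio(x_t=39.0, x_c=33.0, s_t=6.0, s_c=5.0, n_t=3, n_c=3)
rel_change = relative_change_percent(ln_rr)
print(round(ln_rr, 4), round(v_ln_rr, 5), round(rel_change, 2))
```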

The raw mean difference (MD) was calculated as:

$${MD}={X}_{t}-{X}_{c}$$

(8)

The corresponding variance was calculated as:

$${V}_{{MD}}=\frac{{{s}_{t}}^{2}}{{n}_{t}}+\frac{{{s}_{c}}^{2}}{{n}_{c}}$$

(9)

The standardized mean difference (SMD) was calculated as:

$${SMD}=\frac{({X}_{t}-{X}_{c})}{{{SD}}_{p}}$$

(10)

and

$${{SD}}_{p}=\sqrt{\frac{\left({n}_{t}-1\right){{s}_{t}}^{2}+\left({n}_{c}-1\right){{s}_{c}}^{2}}{{n}_{t}+{n}_{c}-2}}$$

(11)

where, \({{SD}}_{p}\) is the pooled within-group standard deviation.

The corresponding variance was calculated as:

$${V}_{{SMD}}=\frac{{n}_{t}+{n}_{c}}{{n}_{t}\times {n}_{c}}+\frac{{{SMD}}^{2}}{2({n}_{t}+{n}_{c})}$$

(12)
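The raw and standardized mean differences (Eqs. 8 to 12) can be sketched in the same way. The function names and the input values below are hypothetical and serve only to show how the formulas fit together.

```python
import math


def mean_difference(x_t, x_c, s_t, s_c, n_t, n_c):
    """Raw mean difference (Eq. 8) and its variance (Eq. 9)."""
    md = x_t - x_c
    variance = s_t ** 2 / n_t + s_c ** 2 / n_c
    return md, variance


def standardized_mean_difference(x_t, x_c, s_t, s_c, n_t, n_c):
    """SMD (Eq. 10), pooled SD (Eq. 11) and variance of the SMD (Eq. 12)."""
    sd_pooled = math.sqrt(
        ((n_t - 1) * s_t ** 2 + (n_c - 1) * s_c ** 2) / (n_t + n_c - 2)
    )
    smd = (x_t - x_c) / sd_pooled
    variance = (n_t + n_c) / (n_t * n_c) + smd ** 2 / (2 * (n_t + n_c))
    return smd, sd_pooled, variance


# Assumed values: equal SDs of 6 and three replicates per group.
md, v_md = mean_difference(39.0, 33.0, 6.0, 6.0, 3, 3)
smd, sd_p, v_smd = standardized_mean_difference(39.0, 33.0, 6.0, 6.0, 3, 3)
print(md, v_md, smd, sd_p, v_smd)
```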

Given that the collected data came from studies applying different research methods, the effect sizes are non-independent and heterogeneous^{62,65}. We accounted for the non-independence by using multivariate meta-analytic models with restricted maximum-likelihood estimation, as implemented in the metafor package^{65}. Paper number was used to specify the random-effects structure of the model. NUEr observations from different primary studies were assumed to be independent, whereas effects within each study received correlated random effects assuming a compound symmetric structure. Random-effects models can estimate the distribution of individual effect sizes of means, residual heterogeneity and sampling error^{65}. They calculate the mean effect size as a weighted mean of individual effect sizes, using the inverse of the sum of the between-study variance (due to variation in experimental conditions) and within-study variance (due to sampling error) as weights^{66}.
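The weighting principle described above can be illustrated with a deliberately simplified sketch. Note that this is not the model used in this study: it substitutes the DerSimonian-Laird moment estimator of the between-study variance for restricted maximum likelihood and ignores the correlation of effects within a paper. The function name and the three effect sizes are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects mean using weights 1 / (v_i + tau^2).

    tau^2 (between-study variance) is estimated with the DerSimonian-Laird
    moment estimator, a simpler stand-in for REML.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)                    # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    sum_w_star = sum(w_star)
    re_mean = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    re_se = math.sqrt(1.0 / sum_w_star)
    return re_mean, re_se, tau2


# Three hypothetical log response ratios with equal sampling variances.
re_mean, re_se, tau2 = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(round(re_mean, 3), round(re_se, 4), round(tau2, 3))
```

Because the effects differ by more than their sampling variances would suggest, the estimated between-study variance is positive and the standard error of the pooled mean is wider than under a fixed-effect model.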

To compare the results of the ROM, MD and SMD methods, all results are expressed as changes in absolute NUEr. For the ROM method, the average NUEr in the control group (\({\bar{X}}_{c}\)) for different management practices was calculated first, and the absolute change in NUEr was then calculated from the relative change in NUEr:

$${{Absolute}\, {change}}_{({ROM})}\left(\%\right)={Relative}\,{change}\left(\%\right)\times {\bar{X}}_{c}$$

(13)

For the SMD method, the average pooled within-group standard deviation \(\left({\overline{{SD}}}_{p}\right)\) for different management practices was calculated first, and the absolute change in NUEr was then calculated as:

$${{Absolute}\, {change}}_{({SMD})}\left(\%\right)={SMD}\times {\overline{{SD}}}_{p}$$

(14)
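The two conversions (Eqs. 13 and 14) can be sketched as below. The sketch assumes that the relative change is supplied in percent, so it divides by 100 to return the absolute change in percentage points of NUEr; this reading of the units is an assumption, as are the function names and input values.

```python
def absolute_change_from_rom(relative_change_percent, mean_control_nuer):
    """Absolute NUEr change from the ROM result (Eq. 13).

    Assumes the relative change is given in %, hence the division by 100.
    """
    return relative_change_percent / 100.0 * mean_control_nuer


def absolute_change_from_smd(smd, mean_pooled_sd):
    """Absolute NUEr change from the SMD result (Eq. 14)."""
    return smd * mean_pooled_sd


# Illustrative inputs: control mean of 33% and an assumed mean pooled SD of 6.
relative_change = (39.0 / 33.0 - 1.0) * 100.0
abs_rom = absolute_change_from_rom(relative_change, mean_control_nuer=33.0)
abs_smd = absolute_change_from_smd(smd=1.0, mean_pooled_sd=6.0)
print(round(abs_rom, 2), round(abs_smd, 2))
```

In this constructed example both routes return an absolute change of 6 percentage points, which is what makes the ROM, MD and SMD results directly comparable.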

#### Assessing impact of site conditions controlling NUEr from original field studies

To evaluate the impact of management practices and site conditions (MAT, MAP, clay content, SOC and soil pH) on NUEr derived from original field studies, a main factor analysis was performed first to assess their overall impact. The principle behind this approach is that generalized conclusions derived from a large number of field studies allow the identification of broadly applicable cause-effect relationships. An analysis of variance was then done to evaluate the contribution of each of the assessed management practices and site conditions to the variation in NUEr^{65}, combined with an analysis of both the Akaike information criterion (AIC) and the *p* value.

Since the impact of nutrient, crop, and soil management practices on NUEr may interact, we analyzed the main and all two-way interactions between management practices and site conditions using a mixed effects model with interaction terms^{62}:

$${y}_{i}={\beta }_{0}+{\beta }_{1}{x}_{i1}+{\beta }_{2}{x}_{i2}+{\beta }_{3}{x}_{i1}{x}_{i2}+\ldots+{u}_{i}+{e}_{i}$$

(15)

where \({y}_{i}\) is the observed effect size of NUEr for the *i*th study, \({x}_{i1}\) and \({x}_{i2}\) are the values of the first and second moderator variables for the *i*th study, \({\beta }_{0}\) is the regression coefficient representing the intercept, \({\beta }_{1}\) and \({\beta }_{2}\) are regression coefficients indicating how the average true effect size changes for a one-unit increase in \({x}_{i1}\) and \({x}_{i2}\), respectively, \({x}_{i1}{x}_{i2}\) is the interaction term with coefficient \({\beta }_{3}\), \({u}_{i}\) is the random effect of study *i*, whose variance represents the residual heterogeneity in the true effect, and \({e}_{i}\) is the sampling error of study *i*.
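The structure of the fixed part of Eq. 15 can be illustrated with a small weighted least squares sketch. It is a simplification under stated assumptions: the random effect \({u}_{i}\) is omitted, weights are the inverse sampling variances, and the helper names `solve` and `fit_interaction_model` as well as the noise-free example data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_interaction_model(y, x1, x2, variances):
    """Weighted least squares for y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    rows = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]   # design matrix
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    p = 4
    xtwx = [[sum(wi * r[j] * r[k] for wi, r in zip(w, rows)) for k in range(p)]
            for j in range(p)]
    xtwy = [sum(wi * r[j] * yi for wi, r, yi in zip(w, rows, y)) for j in range(p)]
    return solve(xtwx, xtwy)


# Noise-free toy data generated from b = (1, 2, -1, 0.5), so the fit recovers them.
x1 = [0.0, 1.0, 0.0, 1.0, 2.0]          # e.g. a continuous site condition
x2 = [0.0, 0.0, 1.0, 1.0, 1.0]          # e.g. an indicator for a practice
y = [1.0, 3.0, 0.0, 2.5, 5.0]
betas = fit_interaction_model(y, x1, x2, [0.1, 0.2, 0.1, 0.2, 0.1])
print([round(b, 6) for b in betas])
```

A non-zero \({\beta }_{3}\) means that the effect of the practice (\({x}_{i2}\)) depends on the site condition (\({x}_{i1}\)), which is the kind of dependence the interaction terms are meant to capture.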

To avoid overfitting the regression model, we first checked for unacceptably high correlations among the predictors before fitting the model (Supplementary Fig. 3). We assessed the impact of each factor and its interaction with other variables using analysis of variance. Pseudo-*R*^{2} values (McFadden’s method) and AIC were used to compare regression models. The best model had high pseudo-*R*^{2} and low AIC values. We also checked the amount of residual heterogeneity according to the Q_{E} output of the *rma.mv* function of the metafor package in R 4.2.2^{65}. The Q_{E} test shows whether the variability in the observed effect sizes that is not accounted for by the moderators is larger than expected from sampling variability alone, so Q_{E} represents the heterogeneity that cannot be explained by the model. Smaller values of Q_{E} reflect a better model performance.
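As a rough illustration of what a residual heterogeneity statistic measures, the sketch below computes a weighted sum of squared residuals for independent effect sizes. It assumes independent sampling errors and fitted values from a model weighted by inverse sampling variances, which is simpler than the multilevel setting handled by *rma.mv*; the function name and the data are hypothetical.

```python
def residual_heterogeneity(effects, fitted, variances, n_coefficients):
    """Weighted residual sum of squares and its degrees of freedom.

    Under the null hypothesis of no residual heterogeneity, the statistic is
    compared with a chi-square distribution with k - p degrees of freedom.
    """
    q_e = sum((y - f) ** 2 / v for y, f, v in zip(effects, fitted, variances))
    df = len(effects) - n_coefficients
    return q_e, df


# Hypothetical effect sizes compared with an intercept-only fit of 0.3.
q_e, df = residual_heterogeneity(
    effects=[0.1, 0.3, 0.5],
    fitted=[0.3, 0.3, 0.3],
    variances=[0.01, 0.01, 0.01],
    n_coefficients=1,
)
print(round(q_e, 3), df)
```

A value of Q_{E} well above its degrees of freedom, as in this toy case, suggests that the moderators in the model leave substantial heterogeneity unexplained.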

#### Assessing spatial variation and global potential to increase NUEr and its uncertainties

The model developed to assess changes in NUEr due to agronomic practices as a function of site conditions (Eq. 15) was used to make a spatially explicit assessment of the impacts of those measures by applying the derived empirical model to all croplands around the globe, using global data sets on site conditions. We thus estimated the global potential for NUEr improvements at a 0.5 × 0.5 degree resolution using existing global data sets of: (1) N inputs by fertilizer and manure from the PANGAEA^{68} (Data Publisher for Earth & Environmental Science) database (https://doi.org/10.1594/PANGAEA.871980); (2) climate data from CRU (http://www.cru.uea.ac.uk/data), including MAT and MAP; (3) land use data from the SPAM (Spatial Production Allocation Model) dataset^{69} (https://www.mapspam.info/data); and (4) soil properties from Soil Grids (http://www.isric.org/explore/soilgrids), including clay content, SOC and soil pH. We mapped both the average NUEr increase and the associated uncertainties, which account for the variance of the effect of studies (Eq. 15) while neglecting the uncertainty in site data (N inputs, climate data, land use data and soil properties). The uncertainty in site data can be large locally but levels out at the coarse 0.5 × 0.5 degree resolution used in this study. Uncertainties in the predicted NUEr change were expressed as the 95% confidence interval around the predicted change in the mean NUEr, constructed from the critical values of a standard normal distribution (i.e., 1.96 for 95%), where the predicted values are based only on the fixed effects (betas from Eq. 15) of the model^{63,70}.
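The construction of a normal-based confidence interval around a fixed-effects prediction can be sketched for a single grid cell. The sketch assumes that the coefficient estimates and their variance-covariance matrix are available from the fitted model; the function name, the two coefficients and the covariance values are hypothetical.

```python
import math


def predict_with_ci(x, betas, cov, z=1.96):
    """Predicted mean change and its 95% CI from the fixed effects only.

    x     : predictor values for one grid cell (leading 1.0 for the intercept)
    betas : fixed-effect coefficient estimates
    cov   : variance-covariance matrix of the coefficient estimates
    """
    prediction = sum(xi * bi for xi, bi in zip(x, betas))
    n = len(x)
    variance = sum(x[i] * cov[i][j] * x[j] for i in range(n) for j in range(n))
    half_width = z * math.sqrt(variance)
    return prediction, prediction - half_width, prediction + half_width


# Hypothetical model: intercept plus one site condition with value 10 in this cell.
pred, ci_low, ci_high = predict_with_ci(
    x=[1.0, 10.0],
    betas=[5.0, 0.2],
    cov=[[1.0, -0.05], [-0.05, 0.01]],
)
print(round(pred, 2), round(ci_low, 2), round(ci_high, 2))
```

Applying such a function to every 0.5 × 0.5 degree cell would yield one map of the predicted change and two maps for the interval bounds, without propagating any uncertainty in the site data themselves.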

#### Other uncertainties in the impacts of measures on the spatial variation in NUEr increase

The 407 primary studies included in this study were conducted at the plot scale to quantify the NUEr in the soil-plant system. For crop farms, the NUEr at plot scale can be considered representative for the farm scale, assuming that the current agricultural management practices at the experimental locations are equal to the traditional management practices of the farmers. Also, while the NUEr definition at plot scale and farm scale differs for a livestock farmer, it is similar for a crop farmer, with N inputs and N outputs from soils belonging to a farm being equal to N fertilizer inputs and crop N outputs to and from the farm^{71}.

Since the variations in NUEr changes at plot scale are based on the local climate conditions, soil properties and traditional agronomic practices, the results can also be used to predict the variation at a plot (farm) scale and NUEr at a global scale. However, there are significant uncertainties in this upscaling, due to both the uneven distribution of the data used to assess the effect of site properties on management impacts (the data in this study are mainly concentrated in the USA and China, while data from other regions are relatively scarce) and the uncertainties in the global data sets on N inputs by fertilizer and manure, climate, land use and soil properties. Additional studies are needed to assess the impact of management practices on the NUEr in the Animal‐Plant‐Soil System and in the Agro-Food system, in view of N losses in the crop-animal system (from feed to animal products) and in the total food chain^{72}. The latter information is also needed to support policies and actions for sustainable agricultural management.

### Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.