Introduction
Globally, only about 30 % of cultivated cacao comes from breeding programs and other traceable sources. This limited diversity increases variability in crop productivity, bean quality, and susceptibility to climate change (Cilas & Bastide, 2020; Farrell et al., 2018; Motamayor et al., 2008). Despite the steady increase in demand for cacao every year (End et al., 2021; Hütz-Adams et al., 2002), the chocolate industry faces multiple threats to the supply of its key ingredient, cacao beans.
However, the lack of genetic diversity in current cacao cultivated varieties is often overlooked, even though it poses a threat to the entire cacao value chain. To address this issue, collecting, safeguarding, and facilitating access to cacao genetic diversity is essential. This will provide cacao breeders, scientists, and the industry with greater opportunities to access materials to produce varieties that comply with production and quality requirements, and ultimately to make these improved varieties available to small-scale farmers. A robust strategy to support regional and national breeding programs should focus on the neglected and underutilized genetic richness of cacao grown at the farm level (Ceccarelli et al., 2022; End et al., 2021).
Cacao farming in Nicaragua covers about 10,000 hectares and involves more than 11,000 small farmers, who produce 6,600 metric tons annually, representing more than USD 5 million in revenues (Wiegel et al., 2020). Nicaragua ranks 25th among producing countries, and since 2015 it has been categorized as a fine or aroma supply origin (International Cocoa Organization, 2015).
A large proportion of the cacao genetic material currently grown by small farmers comes from hybrid open-pollinated seeds, which are characterized by high variability in terms of yield, tolerance to pests and diseases, and bean quality (Ji et al., 2013). Other producers were supplied with planting materials of higher quality sourced from the cacao research center ''El Recreo'', managed by the Nicaraguan Institute of Agricultural Technology (INTA), and distributed via several cacao cultivation projects over a decade (1990-2000). Some farmers multiplied their own seeds from the best-performing trees, but the agronomic performance of these cacao plantations is below expectations. Additionally, Trinitario clones such as UF-296, UF-613, UF-667, and ICS-95 were widely distributed during the 1990s (Mata-Quirós et al., 2017).
Between 2005 and 2013, new planting material, including CC-137, PMCT-58, and three new clones that are descendants of UF-237 (CATIE-R1, CATIE-R4, and CATIE-R6), was introduced into the country because of its high bean quality, productivity, and tolerance to moniliasis and black pod (Phillips-Mora et al., 2012). A small proportion of farmers, mainly medium-size farmers, bought or reproduced their own planting material using promising clones or varieties available at private clonal gardens nearby.
Since 2015, nursery companies such as ECOM have used advanced micro-grafting technology to mass-produce cacao plants distributed among ongoing cacao projects nationwide (ECOM Agroindustrial Corp. Lt, 2020). Although certified at the nursery and multiplication process level, no private provider has distributed duly certified cacao genetic material (Ceccarelli et al., 2022).
In 2018, the cacao research and breeding program of INTA developed a new productive clone named ''PACAYITA'', which has been widely distributed. However, no data on its agronomic and yield performance are currently available (Instituto Nicaragüense de Tecnología Agropecuaria, 2018; Martorell Mir, 2020).
In Nicaragua, the origin of several clones is unknown. Mislabeling of available genetic resources is one of the most significant problems within germplasm collections and clonal gardens, with some studies estimating labeling errors of up to 30 % in international collections (Turnbull et al., 2017).
Previous studies have shown that the genetic makeup of cacao in Nicaragua is mainly composed of Trinitario and Criollo materials (Trognitz et al., 2011). However, these studies have limited geographic coverage or focused on a few germplasm collections (Herrera-García et al., 2015; Ruiz et al., 2011), leaving the genetic diversity of cultivated cacao across the country largely unexplored.
Recent advances in DNA sequencing technology, particularly Single Nucleotide Polymorphism (SNP) sequencing, have allowed the detection of duplicates and mislabeling (Mata-Quirós et al., 2017). This technology has been successfully used to study the genetic diversity of various crops, including grapevine (De Lorenzis et al., 2019), corn (Adu et al., 2019), and beans (Jiménez, 2019), among others. In the case of cacao, SNP markers have been employed to characterize and classify the resources of different germplasm banks worldwide (Mahabir et al., 2017; Mata-Quirós et al., 2017).
According to a national baseline assessment conducted by the Ministry of Agriculture (MAG), substituting around 5000 ha of hybrid plantations with an average density of 600 plants/ha with clonal plantations at 1000 plants/ha would require producing at least 5 million grafted plants (End et al., 2021; Somarriba, 2013). With the current growth of and investment in commercial cacao in Nicaragua, there is a need for a national breeding strategy, as well as a traceability and certification system. This would enable relevant authorities to better coordinate the supply and demand of planting material and ensure the genetic integrity of cacao plants being produced and sold at clonal gardens and private nurseries.
The objective of this study was to evaluate the genetic resources on farms and provide information for future breeding programs, as well as to lay the foundation for a national traceability and certification system.
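The planting-material requirement quoted above follows from simple arithmetic. The sketch below reproduces it using only the area and density figures given in the text; the function name and the optional loss-rate parameter are illustrative assumptions and are not part of the MAG assessment.

```python
import math


def grafted_plants_needed(area_ha: float, density_per_ha: int,
                          loss_rate: float = 0.0) -> int:
    """Plants required to establish a clonal plantation of the given area.

    loss_rate is a hypothetical allowance for nursery or field mortality
    (0.0 means every grafted plant survives); it does not come from the text.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    return math.ceil(area_ha * density_per_ha / (1.0 - loss_rate))


# Figures taken from the text: 5000 ha, 600 plants/ha (hybrid), 1000 plants/ha (clonal).
hybrid_trees_replaced = 5000 * 600
clonal_plants = grafted_plants_needed(5000, 1000)

print(f"Hybrid trees replaced: {hybrid_trees_replaced:,}")
print(f"Grafted plants needed: {clonal_plants:,}")
```

Any mortality allowance would push the requirement above the 5 million minimum, which is consistent with the wording "at least" in the assessment.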
Materials and methods
Sampling strategy and selection criteria for elite trees
The genetic material (leaves) was sampled from selected cacao trees located on different farms owned by independent farmers or members of the following associations/cooperatives: La Campesina, Flor de Pancasán, Cooperativa Agropecuaria de Servicios Extracciones Esenciales (COOPESIUNA, R.L), and others working with the non-governmental organization Catholic Relief Services (CRS). Ideally, to obtain a representative sample of the genetic variability of cacao in each visited zone, the sampling strategy should include as many individuals as possible. However, due to the high cost of the analysis, trees with special features were prioritized, including (but not limited to) high-yielding trees, old trees, trees showing high tolerance to pests and diseases, vigorous individuals, and trees with white seeds (locally referred to as Criollo or Acriollado). Based on these criteria, a total of 49 trees were selected by farmers across various locations in the municipalities of Matiguas (15 samples), Rancho Grande (11 samples), Siuna (10 samples), Bonanza (7 samples), Waslala (6 samples), Rosita (5 samples), Matagalpa (2 samples), Rivas (2 samples), and Nueva Segovia (2 samples).
Collection, preparation, and shipment of samples
Between 2018 and 2020, several collection expeditions were carried out in selected cacao production areas, resulting in the sampling of a total of 49 cacao trees. From each tree, a healthy leaf was collected from the third node of a productive branch located up to 2 meters above ground. The leaf samples were air-dried, certified through phytosanitary controls to ensure the absence of pests and diseases, and subsequently shipped to the laboratory for DNA extraction and fingerprinting. In addition to the leaf sampling, a survey was conducted with each farmer to document the local criteria used to select and evaluate elite trees at the farm level.
DNA extraction and sequencing
DNA was extracted from each leaf sample using the DNeasy® Plant Mini Kit from Qiagen, Inc., Valencia, CA, USA, following the supplier's instructions. The samples were stored at -80 °C until sequencing was performed using a Caspar-Fluidigm SNP microfluidic probe with qPCR and fluorescence reading. The sequences of the 96 SNP markers used in this study were provided by the USDA (D. Zhang, personal communication, 2018). These markers were previously selected based on their balanced distribution across the ten chromosomes of T. cacao and for their high level of polymorphism (Ji et al., 2013). This set of SNP markers has been partially or completely used in other similar studies conducted in Honduras (Ji et al., 2013), Puerto Rico (Cosme et al., 2016), Jamaica (Lindo et al., 2018), Peru (Arevalo-Gardini et al., 2019), and Madagascar (Li et al., 2021).
Data analysis
To assign each sampled tree to a distinct genetic group, a principal component analysis was performed using the Adegenet package in R software (R Core Team, 2020), with the parameters pca = 10 and da = 10. To analyze relationships between groups, Structure software (version 2.3.4) (Pritchard Lab Stanford University, 2012) was used, assigning each tree to one of the ten groups previously described by Motamayor et al. (2008) based on data from 220 genotypes of T. cacao that were sequenced using the same SNP markers. Three markers in which the expected alleles were not found (SNP_546, SNP_731, and SNP_1149) were excluded. The parameters used for genetic grouping included 93 loci, a burn-in period of 50,000, and 100,000 reps with K = 10. Additionally, a Bartlett test was conducted for each elite tree sample to determine its heterozygosity and consanguinity index. Tree origins were classified as pure when they exhibited more than 75 % membership in a particular genetic group. Conversely, trees with origin values below 75 % were classified as hybrids with origins from two or more groups, as per the method outlined by Lukman et al. (2014).
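The pure/hybrid rule described above can be sketched as a simple threshold on the membership coefficients of a tree. This is an illustrative sketch only: the function name and the example coefficients are hypothetical, and the text does not say how a tree at exactly 75 % is treated, so it is labeled a hybrid here by assumption.

```python
from typing import Dict, Tuple

PURITY_THRESHOLD = 0.75  # from the text: more than 75 % membership means "pure"


def classify_tree(membership: Dict[str, float],
                  threshold: float = PURITY_THRESHOLD) -> Tuple[str, str]:
    """Return (label, dominant_group) for one tree.

    membership maps each reference group to its membership coefficient.
    Coefficients are renormalized so that they sum to 1 before the test.
    A tree at exactly the threshold is labeled 'hybrid' (an assumption).
    """
    total = sum(membership.values())
    if total <= 0:
        raise ValueError("membership coefficients must sum to a positive value")
    top_group, top_value = max(membership.items(), key=lambda kv: kv[1])
    label = "pure" if top_value / total > threshold else "hybrid"
    return label, top_group


# Hypothetical membership coefficients, not values measured in this study.
example_pure = {"Criollo": 0.97, "Amelonado": 0.02, "Nacional": 0.01}
example_hybrid = {"Amelonado": 0.48, "Criollo": 0.30, "Iquitos": 0.22}

print(classify_tree(example_pure))
print(classify_tree(example_hybrid))
```

For a hybrid, the second element of the result is only the largest contributor; the full set of coefficients is still needed to describe its ancestry.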
Results
Genetic diversity and inbreeding of elite cacao trees in Nicaragua
The heterozygosity index of the sampled elite cacao trees was found to be significantly lower than expected, as indicated by the Bartlett test results, with Ho and He values of 0.28 and 0.38, respectively. The analysis of the genetic diversity, using 93 loci, detected an average of 185 alleles, indicating that the SNP markers used were sufficiently polymorphic and appropriate for the analysis of the genetic diversity in this study. Additionally, a high consanguinity index (0.58) was observed among the elite trees, indicating a high level of inbreeding.
The genetic relationships between sampled trees and the reference groups were revealed through Principal Component Analysis (PCA), as shown in Figure 1.
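For readers unfamiliar with these indices, the sketch below shows the textbook way of computing observed heterozygosity (Ho), expected heterozygosity (He), and the fixation index F = 1 - Ho/He at a single SNP. The genotypes are hypothetical, and the source does not state which estimator produced the reported consanguinity index, so this should not be read as a reproduction of that figure.

```python
from collections import Counter
from typing import Sequence, Tuple

Genotype = Tuple[str, str]  # the two alleles carried by one diploid tree


def observed_heterozygosity(genotypes: Sequence[Genotype]) -> float:
    """Fraction of trees carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes: Sequence[Genotype]) -> float:
    """He = 1 - sum(p_i^2), with p_i the allele frequencies in the sample."""
    alleles = [allele for pair in genotypes for allele in pair]
    counts = Counter(alleles)
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def fixation_index(ho: float, he: float) -> float:
    """F = 1 - Ho/He; positive values indicate a heterozygote deficit."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


# Hypothetical genotypes of four trees at one biallelic SNP.
locus = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "A")]
ho = observed_heterozygosity(locus)
he = expected_heterozygosity(locus)
print(f"Ho={ho:.3f}  He={he:.3f}  F={fixation_index(ho, he):.3f}")
```

In a multi-locus study these quantities are usually averaged over loci, and an observed value below the expected one, as reported here, points to a deficit of heterozygotes.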
Genetic structure of elite trees
The majority of the sequenced trees did not have a pure origin and were identified as hybrids between various reference groups. Among the sampled trees, the most represented genetic groups were Amelonado (36 %), followed by Criollo (17 %) and IMC-Iquitos (15 %) (Figure 2). Other genetic groups such as Ucayali, Nacional, Nanay, and Parinari were represented in less than 15 % of the sampled trees.
Twenty-seven of the sampled trees had more than half of their genetic background from a particular group, mostly Amelonado. On the other hand, 36 trees had origins from more than three groups. For instance, the tree labeled ''M19-05W'' had origins from six different groups. Nine trees exhibited genetic purity, with more than 75 % of their genetic background belonging to one specific group. Six of these trees, ''El Bálsamo1'', ''La union-El Corozo'', ''A19-4R'', ''La Campesina7'', ''La Campesina1'', and ''M19-01W'', belonged to the Amelonado group. In addition, two trees from La Cumplida and one tree from Mozonte municipality also showed a high level of purity (mainly Criollo) (Figure 3).
The analysis showed that approximately 50 % of the elite cacao trees had ancestral Criollo lineage, ranging from 5 % to 50 % of Criollo ancestry (Figure 4). Owners of the trees with the highest percentage of Criollo lineage described them as Criollo or Acriollado, indicating a local understanding of the genetic distinction of cultivated trees and the application of farmers' criteria for on-farm selection and breeding purposes.
White-seeded trees were found to belong to three different types of genetic composition: (1) 100 % Criollo, as seen in the samples ''La Cumplida1'', ''La Cumplida2'', and ''Mozonte''; (2) hybrids with Criollo lineage, such as samples ''A19-5B'', ''A19-4'', and ''A19-3''; and (3) trees without Criollo lineage, as defined in the Motamayor groups, including ''La Campesina4'', ''A19-1'', ''A19-2R'', ''El_Balsamo1'', and ''La Campesina3''. These trees were mostly hybrids with Amelonado or Ucayali origins (Figure 4).
Farmers' criteria for the selection of elite trees
The selection of elite trees carried out by farmers appeared to be based on three main criteria: (1) productivity, (2) resistance or tolerance to pod diseases, and (3) white bean color, which farmers considered an indicator of quality. The frequency at which each of these criteria appeared in the sampled trees is shown in Figure 5. Most elite trees were selected for their high productivity (69 %), while 23 % of the farmers selected them for their high-quality potential, which refers to the white seeds. Up to 27 % of farmers selected their trees based on apparent resistance or tolerance to moniliasis or black pod disease. The plants selected for their productivity ranged from 3 to 40 years old, with an average age of 24 years. They produced between 50 and 280 pods per year, with an average of 123 pods per tree per year. Some farmers selected trees based on combined criteria: 2 % combined white seeds and resistance, 16 % combined resistance and productivity, and 2 % combined productivity and white seeds. None of them combined all three criteria.
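The single-criterion percentages above add up to more than 100 % because some trees were selected on two criteria. A short inclusion-exclusion check, sketched below, shows how the figures fit together. It assumes that each single-criterion percentage includes the trees selected on a combination involving that criterion, which the text does not state explicitly; the function and variable names are illustrative.

```python
from typing import Dict, Tuple


def share_with_any_criterion(single: Dict[str, int],
                             pairwise: Dict[Tuple[str, str], int],
                             triple: int = 0) -> int:
    """Inclusion-exclusion for three overlapping criteria (percent of trees)."""
    return sum(single.values()) - sum(pairwise.values()) + triple


# Percentages reported in the text.
single = {"productivity": 69, "resistance": 27, "white_seeds": 23}
pairwise = {
    ("white_seeds", "resistance"): 2,
    ("resistance", "productivity"): 16,
    ("productivity", "white_seeds"): 2,
}

covered = share_with_any_criterion(single, pairwise, triple=0)
print(f"Trees covered by at least one criterion: {covered} %")

# Share selected on exactly one criterion, under the same assumption.
only = {
    name: value - sum(v for pair, v in pairwise.items() if name in pair)
    for name, value in single.items()
}
print(only)
```

Under this reading the three criteria account for nearly all sampled trees, with the small remainder attributable to rounding.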
Linkages between local criteria and genetic background
Elite trees whose selection was based on productivity were mostly of Amelonado (36 %) and IMC (Iquitos) (19 %) origin (Figure 6). The other groups were represented in a lower proportion (below 11 % each).
The genetic composition of trees selected based on resistance to black pod and moniliasis was very similar to that of trees selected based on productivity criteria: 47 % Amelonado and 18 % IMC-Iquitos (Figure 7).
A higher percentage of Criollo lineage (36 %) was observed in trees selected for their white seeds, followed by Amelonado ancestry (31 %) (Figure 8).
Discussion
Cacao has been cultivated in Latin America since pre-Columbian times (~1900 B.C.). Criollos were the first to be introduced and domesticated, while Forasteros and, mostly, Trinitarios (hybrids between Forasteros and Criollos, originally from Trinidad and Tobago) arrived later (Motamayor et al., 2013). However, these later arrivals did not take long to replace Criollo trees across Central America due to their greater ability to cope with biotic and abiotic stresses, as well as their higher yield potential (Cornejo et al., 2018).
There is evidence of large cacao plantations on the Pacific coast of Nicaragua since before the arrival of Christopher Columbus. The country's production area extended mainly to the isthmus of Rivas, where the Nicaraguan Criollo (T. cacao) and another variety known as Lagarto cacao (T. pentagona) were cultivated, both of high quality and with white seeds.
It was not until the 19th century that the first documented massive introductions of cacao from abroad began, to supply the two largest commercial plantations in Nicaragua: Las Mercedes and Valle Menier, both in Nandaime, Granada. Las Mercedes imported ''Cauca'' from Colombia and ''Trinitario'' from Trinidad, while Valle Menier introduced seeds of the Trinitario genetic type mixed with local Criollo and Lagarto trees (Radell, 1971).
Several waves of introduction of genetic material from various producing countries occurred during the 1970s, 1980s, and 1990s. The most recent wave of massive introduction took place in 2007-2013 under the PCC project (Proyecto Cacao Centroamérica), which introduced and distributed novel genetic material from CATIE, principally Trinitarios (Phillips-Mora et al., 2012).
Previous studies using Simple Sequence Repeat (SSR) markers in Waslala municipality revealed a prevalence of Trinitario hybrids and identified old Criollo trees. The distribution of these hybrids suggested two origins: trees near the main roads and the city of Waslala were introduced recently, possibly in the 2000s by the cooperative Cacaonica and third parties, while trees located further away could have originated from older introductions, with trees adapted to the climatic conditions of those areas (Trognitz et al., 2011). Additionally, Ji et al. (2013) analyzed fine cacao clones in Nicaragua and Honduras with SNP markers and found a majority of Trinitarios with Amelonado origin.
The present study is consistent with historical records, as a prevalence of Amelonados (36 %) and Criollos (17 %) was observed (Figure 2), with most trees resulting from the hybridization between these two groups. The prevalence of Amelonados among the sampled trees could be explained by the ancestral selection of cacao ''Indio'' and ''Indio Rojo'' types, linked to the ''Amelonado'' group due to their high productivity (Ji et al., 2013).
Studies conducted in the Dominican Republic and Cuba have found a similar trend, with a clear dominance of trees with Amelonado and Criollo origins. The study carried out in the Dominican Republic showed a genetic composition very similar to what was found in this study, with the same three major groups represented but in different proportions: 72.1 % Amelonado, 9.5 % Criollo, and 7.8 % Iquitos (Boza et al., 2013). In Cuba, a prevalence of Amelonado (61.6 %) and Criollo (27.3 %) was also found, and the authors stated that those trees were introduced to the island from Central America (Bidot Martínez et al., 2015). Studies conducted in other producing countries reported other genetic groups: in Ghana, Padi et al. (2015) discovered a majority of Amelonados, Nanay, and Iquitos. In Indonesia, a majority of Trinitario, Parinari, and Nanay genetic groups were observed among small cacao farms (Dinarti et al., 2015).
Currently, cacao farms in Nicaragua are populated by two types of genetic material: a) commercial clones distributed by projects or cooperatives for their superior productive characteristics (30 %), and b) trees selected by farmers mainly based on phenotypic criteria (70 %). The genetic analysis highlights that farmers mostly selected Amelonado-type trees, which can be explained by their greater adaptability, resistance, and availability in the country (Ceccarelli et al., 2022; Martorell Mir, 2020).
Unlike farmers who received cacao plants from development projects or their cooperative, very few independent farmers knew the variety or genetic origin of the cacao trees they grow. This lack of knowledge may explain the poor compatibility and low productivity seen in certain farms. Knowing the genetics of the trees could help understand the best combinations and avoid incompatibility issues between clones from the same farm.
The high inbreeding coefficient (0.58) also points to human or natural hybridization between geographically and genetically close trees. Moreover, individuals from the two most represented groups (Amelonado and Criollo) are self-compatible (Lanaud et al., 2017), which increases the consanguinity in the offspring.
In this study, it was found that half of the trees described by farmers as elite trees with white seeds did not have a Criollo origin in their genetic background. Consequently, having white seeds does not necessarily mean that the tree is a native or original ''Criollo'' tree, as defined by Motamayor et al. (2008). Other groups, such as the Amelonado, may also have white or slightly purple seeds, due to their origin or local hybridization.
Many trees labeled as elite trees had genetic linkages with well-known clones (e.g., IMC 67, UF, ICS; data not shown) widely distributed in Nicaragua during the 1980s and 1990s. Therefore, it is very likely that they come from open pollination of commercial clones, whose seeds were harvested and planted (Mata-Quirós et al., 2017). This process of on-farm selection seeks ''genetic improvement'' based on production features but leads to uncertainty regarding the compatibility of selected trees with their neighbors (Ayestas et al., 2013; López et al., 2021).
The genetic analysis of cacao trees selected by farmers according to different criteria, such as productivity, disease resistance, and white seeds, revealed that Amelonado trees were mostly chosen for their productivity and resistance to diseases. This indicates that farmers selected trees that fulfilled both criteria simultaneously. The presence of 6 to 10 % Criollo ancestry in these productive and disease-resistant trees is likely due to the existence of native Criollo trees in the area, whose origin was mixed through local selection and multiplication over time. The trees chosen for their white seeds were mainly of Criollo and Amelonado origin, owing to the strong presence of these groups in Nicaragua.
It is worth noting that none of the trees sampled in this study combined all three selection criteria, i.e., having white seeds, being resistant to diseases, and being highly productive. This underscores a gap in the variety of cacao offered and emphasizes the need for further breeding efforts to enrich the genetic pool of local farms.
DNA analysis is essential for determining the origin or reliability of traded plants and serves as the foundation for an efficient, effective, and reliable traceability system (End et al., 2021). This type of analysis enables the identification of genetic variations that cannot be observed through visual or physical analyses of cacao plants. Such variations can impact the selection and breeding pipeline (Motamayor et al., 2013).
The set of 93 SNPs tested in this study was used to characterize and classify the sampled trees, creating a unique genetic fingerprint for each planted material. This set of markers was found to be suitable for certification and traceability purposes. The genetic make-up suggested that the number of SNP markers could be reduced while achieving similar results, thereby reducing operational costs. For instance, Ji et al. (2013) reported that a smaller number of markers (26 SNPs) could provide 99.99 % confidence in identifying a unique cacao tree.
The knowledge of the region's genetic diversity serves as a baseline for the conservation and use of on-farm T. cacao germplasm (Ceccarelli et al., 2022). It is crucial to maintain a high genetic diversity to have various sources of resistance to pests and diseases, as well as quality attributes, among other factors (Padi et al., 2015).
Currently, the clones traded in Nicaragua mostly belong to the Amelonado genetic group, with limited diversity. The present study did not identify significant proportions of germplasm from the Guiana, LCT-EEN, Nanay, Parinari, and Purus groups. Importing cacao from these genetic groups, for example from the international collections of the Cocoa Research Centre of Trinidad and Tobago and/or CATIE, could expand the genetic variety of cacao trees in Nicaragua and introduce genes of interest, such as resistance to pests and diseases or drought tolerance, into the Nicaraguan cacao gene pool. Clones from the Nanay, Parinari, and IMC groups were highlighted as sources of resistance to pests and diseases (Lukman et al., 2014).
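One common way to reason about how many markers are enough for fingerprinting is the probability of identity (PI), the chance that two unrelated individuals share the same multi-locus genotype. The sketch below uses the standard biallelic formula under Hardy-Weinberg proportions and independent loci. It is an illustration under those assumptions, with hypothetical allele frequencies, and is not claimed to be the method used by Ji et al. (2013).

```python
import math
from typing import Iterable


def pi_locus(p: float) -> float:
    """PI at one biallelic locus with allele frequencies p and q = 1 - p.

    Sum of squared genotype frequencies under Hardy-Weinberg proportions:
    p^4 + (2pq)^2 + q^4.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be in [0, 1]")
    q = 1.0 - p
    return p ** 4 + (2 * p * q) ** 2 + q ** 4


def pi_panel(freqs: Iterable[float]) -> float:
    """PI for a marker panel, assuming the loci are independent (unlinked)."""
    return math.prod(pi_locus(p) for p in freqs)


# Hypothetical panels of maximally informative SNPs (p = 0.5 at every locus).
panel_26 = [0.5] * 26
panel_93 = [0.5] * 93

print(f"PI with 26 SNPs: {pi_panel(panel_26):.2e}")
print(f"PI with 93 SNPs: {pi_panel(panel_93):.2e}")
```

Because the sampled trees show signs of inbreeding and shared ancestry, the unrelated-individuals assumption is optimistic, and a real panel would need to be validated against the observed genotypes before being reduced.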
Several trees with Criollo lineage could be part of the improvement plan to take advantage of genes of interest for organoleptic quality adapted to local climate. Trees such as ''La_Cumplida1'', ''La_Cumplida2'', ''Mozonte'', ''A19-4'', as shown in Figure 3, could play that role. However, a quality assessment of the cacao liquor from these trees is necessary.
Two samples, ''M19-08S'' and ''M19-09S'', exhibited a genetic profile that was markedly different from the others, forming a distinct cluster (data not shown). These trees were sourced from Siuna, Floripon, and Rancho Alegre communities, and were described by farmers as ''Cacao Mono'' and ''Native cacao that does not form horquet''. Given their description and genetics, they may belong to the same family as Theobroma cacao (Malvaceae), but to a different species such as Herrania purpurea. H. purpurea produces pods that are very similar to those of T. cacao but are smaller and produce a bitter drink (Belt, 2003).
Other trees selected by the farmers, such as ''A19-1B'', ''M19-05W'', and ''La_Campesina5'', among others, comprised various ancestries. These trees, which are well adapted to local conditions, might be useful for a national selection and breeding program. However, further assessment of these clones in terms of vegetative growth, yield potential, compatibility, resistance to pests and diseases, adaptation to the environment, and organoleptic quality is recommended.
Conclusion
The genetic analysis carried out in this study revealed the unique genetic make-up of the Nicaraguan cacao, which explains its exclusive profile. Additionally, the analysis exposed a low diversity of local cacao and a lack of knowledge regarding the genetic origin of the clones. This emphasizes the need for a certification and traceability system for cacao planting material in Nicaragua. The SNP fingerprinting of several local trees provided the necessary foundation to implement this traceability and certification system. Furthermore, the genetic study provided insights for a national breeding program. Several genetic materials that could be integrated into a local breeding program were identified, allowing for a better selection of clones that are more adapted to local conditions in the future. By having greater control over the compatibility between clones, higher yields can be achieved in the medium term.