Abstract:
Moringa oleifera Lam. (syn. M. pterygosperma Gaertn.) is one of the most economically important crops and is widely cultivated globally. Traditionally, it has been used as a remedy for conditions such as diabetes, inflammation, cancer, joint pain, heart disease, and bacterial, viral, and fungal infections. M. oleifera is also used for human consumption, as forage, and for water purification. Being a cross-pollinated plant, M. oleifera shows high variability. Owing to its fast growth and adaptability to variable climatic conditions, it is widely cultivated in Coastal and Eastern Kenya. Studying the genetic diversity of the drumstick tree is essential for selecting valuable genotypes and for improving its cultivars through breeding programs. Marker-assisted selection of cultivars for desired traits can accelerate such programs. Single nucleotide polymorphisms (SNPs), the most recent class of molecular markers, were used in this study. Seventeen M. oleifera provenances from Coastal Kenya were analyzed by genotyping-by-sequencing (GBS). Briefly, M. oleifera leaves were homogenized and crushed, and DNA was extracted. Genome complexity reduction and library preparation were then carried out, followed by barcoding, electrophoresis, pooling, preparation of sequencing libraries, and sequencing. The Illumina HiSeq 2500 next-generation sequencing (NGS) platform was used for sequencing and SNP mining. Binning was carried out to remove noise, and sequence contigs were assembled in silico. Data were analyzed using the DArT R and KDCompute platforms. Genetic characterization was achieved through cluster analysis, principal coordinate analysis (PCoA), a 3D plot, and a phylogenetic tree. One hundred and sixty-four (164) genotypes were identified. The analyses showed that the genotypes consistently clustered into four clades (groups).
However, the clusters obtained with each analysis tool comprised different genotypes, owing to the different assumptions each tool employs. The similarity coefficient from hierarchical clustering ranged from 63% to 100%, indicating that the genotypes had low variability. An interactive and highly robust 3D plot was produced using DArT R (adegenet package). Interestingly, the Pwani University genotypes clustered separately, and they did so consistently irrespective of the tool used, possibly because the seeds were of a unique origin. Nevertheless, the high similarity among the genotypes could be attributed to the M. oleifera plants in the various provenances sharing the same ancestry. Given the high frequency of SNPs and their role as a source of allelic variation, this research could contribute to the discovery of associations between allelic forms of genes and phenotypes, enabling alleles to be linked to desirable traits such as fast growth and high seed production.
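
As an illustration of how a pairwise similarity coefficient and a hierarchical grouping of SNP genotypes can be computed, the following is a minimal Python sketch. It is not the DArT R or KDCompute workflow used in the study. The simple-matching coefficient, the average-linkage (UPGMA) rule, the 0/1/2 genotype coding, and the labels G1 to G4 are all assumptions made for illustration, since the abstract does not state which coefficient or linkage was applied.

```python
from itertools import combinations


def simple_matching_similarity(a, b):
    """Fraction of SNP loci with identical calls, skipping missing (None) calls.

    Assumption: genotypes are coded 0/1/2 (copies of the alternate allele).
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    return sum(x == y for x, y in shared) / len(shared)


def upgma_clusters(genotypes, k):
    """Average-linkage agglomerative clustering, stopped at k clusters.

    `genotypes` maps a sample label to its list of SNP calls.
    Distance between two samples is 1 - simple-matching similarity.
    """
    names = list(genotypes)
    dist = {
        frozenset((p, q)): 1.0
        - simple_matching_similarity(genotypes[p], genotypes[q])
        for p, q in combinations(names, 2)
    }

    def avg_dist(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    clusters = [[n] for n in names]
    while len(clusters) > k:
        # Merge the two clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Hypothetical toy data: four samples scored at six SNP loci.
toy = {
    "G1": [0, 0, 1, 2, 0, 1],
    "G2": [0, 0, 1, 2, 0, 2],
    "G3": [2, 2, 0, 0, 1, 0],
    "G4": [2, 2, 0, 0, 1, None],
}

if __name__ == "__main__":
    for p, q in combinations(toy, 2):
        print(p, q, round(simple_matching_similarity(toy[p], toy[q]), 2))
    print(upgma_clusters(toy, 2))
```

Under these assumptions, a similarity range such as the reported 63% to 100% corresponds to the smallest and largest pairwise coefficients across all sample pairs, and cutting the hierarchy at four clusters would be the analogue of the four clades described above. A different coefficient or linkage rule can place samples in different groups, which is one way tools with different assumptions can disagree on cluster membership.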