A large international team of scientists has sequenced and analyzed the 640 million base pair genome of Eucalyptus grandis (known as the Flooded Gum or Rose Gum).

Eucalyptus grandis. Image credit: HelloMojo.
Trees play a significant role in the global carbon cycle. Collectively, they represent a major terrestrial repository of carbon, playing an active role in capturing and processing carbon dioxide and a passive role in storing it.
Eucalyptus can be harvested in tropical and temperate zones, and the genus comprises over 700 species that are rich in genetic variation.
About 80 percent of the woody biomass in a Eucalyptus tree is made of cellulose and hemicellulose, both long chains of sugars; the remainder is primarily lignin, the tough ‘glue’ that holds it all together.
“Eucalypts are now the hardwood plantation species of choice in many parts of the world for applications like paper making and bio-energy,” said Dr Antanas Spokevicius from the University of Melbourne, a co-author of the paper published in the journal Nature.
“This resource will provide a huge boost for breeding and biotechnological tree improvement programs and has put us on the same foot as many other important crop species whose improvement programs have benefited greatly from a sequenced genome.”
“Efforts to sequence the genome of a eucalypt started over a decade ago. There have been a number of international workshops, meetings and other exchanges that have brought the international eucalypt research community together to discuss, and now create, the resources to unlock the potential of eucalypts as a truly global fuel and fiber source,” Dr Spokevicius said.
“The genetic code will help us understand a foundation species for the Australian eco-system and how it affects other species, from fungi through to the koala,” said co-author Dr Carsten Kulheim from the Australian National University.
“It will give scientists the tools to know what plants a koala will feed on and not feed on, which helps with measures to preserve koala habitat.”

Eucalyptus grandis genome features in 1-Mb intervals across the 11 chromosomes; units on the circumference show megabase values and chromosomes: a – gene density, number per Mb, range 6–131; b – repeat coverage; c – average expression state, fragments per kilobase of exon per million sequences mapped; d – heterozygosity in inbred siblings; e – telomeric repeats; f – tandem duplication density; g, h – single nucleotide polymorphisms identified by resequencing BRASUZ1, in 1-Mb bins (g) and per gene (h). Central blue lines connect gene pairs from the most recent whole-genome duplication event. Image credit: Alexander A. Myburg et al.
The sequence of the Eucalyptus grandis genome consists of 640 million base pairs of DNA, containing over 36,000 genes – almost double the number of genes in the human genome.
The scientists identified genes encoding 18 final enzymatic steps for the production of cellulose and the hemicellulose xylan, both cell wall carbohydrates that can be used for biofuel production.
Their results revealed an ancient whole-genome duplication event estimated to have occurred about 110 million years ago, as well as an unusually high proportion of genes in tandem duplicate arrays.
The researchers also found that, among plants sequenced to date, Eucalyptus showed the highest diversity of genes for specialized metabolites such as terpenes.
They identified 113 genes responsible for synthesizing terpenes. These hydrocarbons serve as chemical defenses against pests and also provide the familiar aromatic essential oils used both in medicinal cough drops and in industrial processes.
______
Alexander A. Myburg et al. The genome of Eucalyptus grandis. Nature, published online June 11, 2014; doi: 10.1038/nature13308