DNA Sequencing of TRK

DNA sequencing of TRK genes is needed to identify larger alterations, such as gene rearrangements, as well as simple mutations in the TRK genes and their transcripts. The history of DNA (and mRNA) sequencing, from Maxam-Gilbert and Sanger sequencing to modern next generation techniques, has been reviewed elsewhere. In the early days of DNA sequencing, one person read the gel out loud while another recorded the sequence. Sometimes those little bands were hard to distinguish from one another.

Sequencing of target transcripts

 

DNA sequencing picture of TPM3-TRK1 and ETV6-TRK3 gene rearrangements in thyroid carcinoma

Figure 1. Progression of methods used to identify TRK gene rearrangements.

In the Beimfohr 1999 thyroid study, RNA was extracted from the tissue and converted to DNA by reverse transcription PCR, or rtPCR. These investigators had prior knowledge of which TRK gene rearrangements might be present in these thyroid carcinomas, which came from school children exposed to radiation in the Chernobyl accident. PCR primers were chosen accordingly. The purified PCR products were sequenced directly by the Sanger dideoxy method, which dates from 1977. The 2014 Leeman-Neill thyroid carcinoma study used formalin-fixed, paraffin-embedded (FFPE) papillary thyroid carcinomas from Chernobyl survivors, collected between 1998 and 2008. The extracted mRNA was converted to cDNA using rtPCR. The PCR amplicons were sequenced on an Illumina HiSeq2000.

Beimfohr C, Klugbauer S, Demidchik EP, Lengfelder E, Rabes HM. (1999) NTRK1 re-arrangement in papillary thyroid carcinomas of children after the Chernobyl reactor accident. Int J Cancer. 80(6):842-7.

Leeman-Neill RJ, Kelly LM, Liu P, Brenner AV, Little MP, Bogdanova TI, Evdokimova VN, Hatch M, Zurnadzy LY, Nikiforova MN, Yue NJ, Zhang M, Mabuchi K, Tronko MD, Nikiforov YE. (2014) ETV6-NTRK3 is a common chromosomal rearrangement in radiation-associated thyroid cancer. Cancer. 120(6):799-807.

NGS of cancer associated mRNA transcripts to identify “actionable targets”

These approaches are all well and good, but a newer approach does not require prior knowledge of both fusion partners. A two-step TRK gene rearrangement test has been developed. Once the tumor has been confirmed to stain strongly with a pan-Trk antibody, 113 primers spanning regions of 15 specific genes of interest are used. Granted, some prior knowledge is still required to cast this broader net, which is based on descriptions of gene rearrangements in the peer-reviewed literature. Once the presence of a cancer-driving Trk fusion protein has been verified, the patient may be entered in a Trk treatment clinical trial.

5′-RACE to identify novel Trk fusion protein transcripts

Sometimes one only needs to know half of the fusion. Such was the case for Wang and coworkers (2017), who identified some common as well as a novel TRK3 gene rearrangement in childhood melanocytic neoplasms. They knew which TRK sequences to look for at the 3′ end of the transcript. As part of the rtPCR, they added known sequences to the 5′ end of the cDNA. The PCR primers used for sequencing were directed against this artificial 5′ end and against 3′ target regions of the TRK3 transcript. This process, called 5′-RACE, uncovered a transcript for a fusion protein with an N-terminal myosin 5 motor fused to the kinase domain of TrkC at its C-terminus.
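The logic of reading a 5′-RACE product can be sketched in a few lines of Python. This is only an illustration of the idea, not the pipeline used by Wang and coworkers: the function name and all three sequences are made up for the example. The read starts with the artificial 5′ sequence, ends in the known TRK region, and whatever lies in between belongs to the unknown fusion partner.

```python
def novel_upstream_sequence(read, adapter, known_anchor):
    """Return the bases between the artificial 5' adapter and the known
    3' anchor, or None if the read does not have that layout."""
    if not read.startswith(adapter):
        return None
    anchor_start = read.find(known_anchor, len(adapter))
    if anchor_start == -1:
        return None
    return read[len(adapter):anchor_start]


# Toy sequences, invented for illustration only (not real NTRK3 or adapter sequence).
ADAPTER = "AAGCAGTGG"
UNKNOWN_PARTNER = "ATGGCCGCTAGT"
KNOWN_TRK_ANCHOR = "GATGTGCAGCAC"

read = ADAPTER + UNKNOWN_PARTNER + KNOWN_TRK_ANCHOR + "ATCAAG"
print(novel_upstream_sequence(read, ADAPTER, KNOWN_TRK_ANCHOR))
```

In practice the recovered upstream sequence would then be searched against a reference genome or transcript database to name the fusion partner.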

Wang L, Busam KJ, Benayed R, Cimera R, Wang J, Denley R, Rao M, Aryeequaye R, Mullaney K, Cao L, Ladanyi M, Hameed M. (2017) Identification of NTRK3 Fusions in Childhood Melanocytic Neoplasms. J Mol Diagn. 19(3):387-396. PMID: 28433076

Discovering gene rearrangements by whole genome sequencing

It would seem that one needs to know what one is looking for in order for NGS results to really make sense. A large international group sequenced the complete genomes of 560 triple negative breast cancers. In their supplemental data they published the chromosome positions of gene rearrangements that occurred in non-protein-coding introns. We looked for the ETV6-NTRK3 gene fusion that has been reported in a form of triple negative breast cancer, secretory carcinoma of the breast (Tognon 2002). Numerous ETV6 gene rearrangements were reported in the 560 triple negative breast cancer publication. The ETV6 gene codes for the TEL transcription factor. The ERC1-NTRK3 rearrangement was “recurrent.” Both ERC1 and ETV6 are on chromosome 12.

Picture showing location of genes ERC1 and ETV6 on chromosome 12.

Figure 2. The location of ERC1 and ETV6 on chromosome 12.

The product of the ERC1 gene is also known by its protein name, ELKS. It has been found fused to the kinase domain of the RET receptor tyrosine kinase (Liu 2005).

The supplemental data of the 560 triple negative breast cancer study also contained a reference to an intronic ARNT2-NTRK3 gene fusion.

Picture showing location of genes ARNT2 and TRK3 on chromosome 15.

Figure 3. The location of ARNT2 and TRK3 on chromosome 15.

Some natural questions are

  1. Can a fused intervening region (intron) be spliced out of an mRNA transcript?
  2. If proper splicing cannot occur, is there a nonsensical “frameshift” region in the translated protein between the properly spliced exons?
  3. Regardless of fusion intron splicing, are the protein coding parts (exons) in the same reading frame in the transcript?
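The third question comes down to simple arithmetic modulo 3. The sketch below is a minimal illustration with hypothetical names and numbers: it assumes we know how many coding bases the 5′ partner contributes before the junction, and at which position within a codon (0, 1 or 2) the first retained exon of the 3′ partner begins in its own native transcript.

```python
def fusion_is_in_frame(upstream_coding_bases, downstream_codon_position):
    """Check whether a fusion junction preserves the reading frame.

    upstream_coding_bases: coding bases from the 5' partner before the junction.
    downstream_codon_position: 0, 1 or 2, the position within a codon at which
        the first retained exon of the 3' partner starts in its native transcript.
    """
    if downstream_codon_position not in (0, 1, 2):
        raise ValueError("codon position must be 0, 1 or 2")
    # The 5' partner leaves this many bases of an unfinished codon at the junction.
    leftover = upstream_coding_bases % 3
    # The 3' partner stays in frame only if it expects exactly that many bases.
    return leftover == downstream_codon_position


# Hypothetical numbers, for illustration only.
print(fusion_is_in_frame(1009, 1))  # True: 1009 % 3 == 1
print(fusion_is_in_frame(1010, 1))  # False: 1010 % 3 == 2
```

If an unspliced intron fragment remains in the transcript, its length would have to be added to the upstream count, and any stop codon inside it would end translation regardless of frame.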

In a triple BTBD1-CEBP-TRK3 gene fusion that we have reviewed on this site, the CEBP1 information gets spliced out of the mRNA transcript. What’s more, the fusion can drive cancer.

Tognon C, Knezevich SR, Huntsman D, Roskelley CD, Melnyk N, Mathers JA, Becker L, Carneiro F, MacPherson N, Horsman D, Poremba C, Sorensen PH. (2002) Expression of the ETV6-NTRK3 gene fusion as a primary event in human secretory breast carcinoma. Cancer Cell. 2(5):367-76.

Liu RT, Chou FF, Wang CH, Lin CL, Chao FP, Chung JC, Huang CC, Wang PW, Cheng JT. (2005) Low prevalence of RET rearrangements (RET/PTC1, RET/PTC2, RET/PTC3, and ELKS-RET) in sporadic papillary thyroid carcinomas in Taiwan Chinese. Thyroid. 15(4):326-35.

Single amino acid mutations and NGS

The Catalog of Somatic Mutations in Cancer (COSMIC) has a fantastic database of mutations found in cancer, but they are hard to interpret unless one is really looking for something. Some of these mutations have to be viewed in light of the natural variation in amino acid substitutions in the Trk kinases across vertebrate species. Each of the three Trk kinases from five representative species of vertebrates was aligned using ClustalW. The variations that have been tolerated over the course of evolution give some insight into whether a somatic mutation can be tolerated by the patient. On the other hand, some amino acids in the Trk isoforms are absolutely conserved in our five representative vertebrates. This is particularly true of the kinase domain, which is of concern to the oncologist with a patient whose tumor is driven by a TRK gene rearrangement.

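The symbols beneath each ClustalW alignment have the following meanings:

“ * ” identical in all five sequences

“ : ” conservative substitution, probably functionally the same

“ . ” semi-conservative substitution, only weakly similar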

Trk A

The following sequences of TrkA were obtained from NCBI.  TrkA mutations were obtained from COSMIC.

TrkA human BAA34355.1

TrkA mouse NP_001028296.1

TrkA chicken NP_001028296.1

TrkA frog XP_002939035.1

TrkA fish NP_001288285.1

 

ClustalW DNA sequence alignment of TrkA gene in Xenopus, zebrafish, human, mouse and chicken.

Figure 4. TrkA kinase domains of five vertebrate species aligned with ClustalW. The kinase domain proper starts at amino acid 510.

 

No mutations were found in or adjacent to the autophosphorylation sites (Fig. 10). In TrkA, the autophosphorylation sites are Y496, Y676, Y680, Y681, and Y791.
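Checking a list of mutation positions against these sites is easy to script. The sketch below uses the TrkA site positions given above; the function name, the one-residue window used to define "adjacent", and the example positions are our own assumptions, not COSMIC data.

```python
# TrkA autophosphorylation sites, as listed in the text.
TRKA_AUTOPHOSPHORYLATION_SITES = (496, 676, 680, 681, 791)


def near_autophosphorylation_site(position,
                                  sites=TRKA_AUTOPHOSPHORYLATION_SITES,
                                  window=1):
    """True if the residue position is at, or within `window` residues of, a site."""
    return any(abs(position - site) <= window for site in sites)


# Hypothetical mutation positions, for illustration only.
for position in (495, 600, 682):
    print(position, near_autophosphorylation_site(position))
```

Setting `window=0` restricts the check to the tyrosines themselves.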

Figure 5. Distribution of TrkA mutations associated with cancer (COSMIC). A. Cartoon of TrkA domains and sites of post-translational modification and enzyme function. B. Frequency distribution of all cancer-associated point mutations, including synonymous substitutions. C. Frequency distribution of only those point mutations that might impact function.

 

COSMIC offers two ways of viewing point mutations: as potentially pathogenic, or as simple changes in the nucleotide sequence that may result in the same amino acid after the transcript is translated into protein (panel B). Nonsynonymous substitutions are shown in the histogram in panel C. Note the comparatively high frequency of single nucleotide substitutions in the highly conserved kinase domain.
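The distinction between panels B and C rests on the genetic code: a single nucleotide change is synonymous if the old and new codons translate to the same amino acid. A minimal sketch using the standard genetic code follows. The function name and the example codons are ours, chosen only to illustrate the three outcomes; they are not taken from COSMIC.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change as synonymous, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":            # new stop codon
        return "nonsense"
    return "missense"


# Illustrative codon pairs (not COSMIC entries).
print(classify_substitution("CTG", "CTA"))  # Leu to Leu
print(classify_substitution("CGT", "CTT"))  # Arg to Leu
print(classify_substitution("TGG", "TGA"))  # Trp to stop
```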

TrkB

We will follow the same format as for TrkA. The following sequences were used for the ClustalW alignment.

TrkB human AAB33109.1

TrkB mouse NP_001269890.1

TrkB chicken NP_990562.1

TrkB Xenopus NP_001079579.1

TrkB zebrafish NP_001184090.2

ClustalW DNA alignment of TrkB kinase domain in human, mouse, chicken, frog and zebrafish.

Figure 6. The TrkB kinase domains of five vertebrate species aligned with ClustalW. The arrows mark the boundaries of the kinase domain proper.

 

Note that the TrkB kinase domain is also highly conserved throughout vertebrate evolution.

Figure 7. Distribution of TrkB mutations associated with cancer (COSMIC). A. Cartoon of TrkB domains and sites of post-translational modification and enzyme function. B. Frequency distribution of all cancer-associated point mutations, including synonymous substitutions. C. Frequency distribution of only those point mutations that might impact function.

 

TrkC

The sequences used for TrkC are the following:

TrkC human NP_001012338.1

TrkC mouse NP_001028296.1

TrkC chicken NP_990500.1

TrkC Xenopus XP_017947960.1

TrkC killifish A0A1A8KFW1

ClustalW DNA alignment of TrkC kinase domain in human, chicken, frog, killifish and mouse.

Figure 8. The TrkC kinase domain of five vertebrate species aligned with ClustalW.

 

TrkC autophosphorylation sites are Y516, Y705, Y709, and Y710. Like the other isoforms of Trk, the kinase domain of TrkC is highly conserved. We could not find a sequence of zebrafish TrkC, so a sequence of killifish TrkC was used instead. Splicing variations are particularly common in the TrkB and TrkC isoforms. What is considered the canonical sequence may differ from one species to another.

Figure 9. Distribution of TrkC mutations associated with cancer (COSMIC). A. Cartoon of TrkC domains and sites of post-translational modification and enzyme function. B. Frequency distribution of all cancer-associated point mutations, including synonymous substitutions. C. Frequency distribution of only those point mutations that might impact function.

 

Single amino acid substitutions, do they mean anything?

Perhaps it will help to take a closer look at point mutations in the autophosphorylation region of the three Trks. For clarity, only the side chains of amino acids that might elicit the biggest change in structure are shown. The side chain of glycine (G) is a single hydrogen, which is not shown at all because it is attached to a carbon. In this method of drawing structures, a carbon is assumed anywhere there is a kink in the stick structure. Arginine (R) in wild type TrkA is not only charged but also bulky.

 

 

Picture showing types of amino acid substitutions at the autophosphorylation domains of TrkA, B and C.

Figure 10. Amino acid “substitutions” are shown for the autophosphorylation domains of TrkA, B, and C. The side chains of a select substitution for each isoform are shown.

 

In TrkB, substituting a leucine (L) for an arginine (R) could conceivably result in substantial conformational changes. The hydrogens on the nitrogens of the arginine side chain are drawn to emphasize the polar, water-loving nature of arginine. The leucine side chain is hydrophobic. Finally, in TrkC, the substitution of a small, hydrophilic serine (S) with a large, hydrophobic phenylalanine (F) might also result in a conformational change. The hydrogen on the oxygen of the serine side chain is drawn to emphasize the polar, hydrophilic nature of this side chain. Do any of these single base pair substitutions drive the cancers with which they were associated? We do not know. Many of the single nucleotide substitutions resulted in the same amino acid in the translated protein. If CIPA-associated mutations in TrkA are any indication, most mutations are inactivating.
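One rough way to put a number on "hydrophilic to hydrophobic" is the Kyte-Doolittle hydropathy scale, in which positive values are hydrophobic and negative values hydrophilic. The sketch below is our own illustration, not an analysis from COSMIC; a large shift only hints that a substitution might matter and says nothing about whether it drives a cancer.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_shift(wild_type, substitute):
    """Change in hydropathy when `wild_type` is replaced by `substitute`."""
    return round(KYTE_DOOLITTLE[substitute] - KYTE_DOOLITTLE[wild_type], 1)


# The two substitutions discussed above.
print("R to L:", hydropathy_shift("R", "L"))  # 8.3
print("S to F:", hydropathy_shift("S", "F"))  # 3.6
```

Hydropathy ignores side chain size and charge, so it captures only part of what the drawings in Figure 10 show.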

Activation loop mutations are discussed in greater detail in a post on this site.

The juxtamembrane region, between the transmembrane domain and the kinase domain, appears to be a source of potentially important mutations.

Of course, there are also kinase domain mutations. Nonsense kinase domain mutations are something to think about.

The discovery of point mutations in Trk genes in colorectal cancer has been enabled by NGS.

We have stressed the greater importance of gene rearrangements compared to point mutations on this site. We also report on an NACC2-NTRK3 gene rearrangement in astrocytoma. This rearrangement was found NGS style, on COSMIC.