PhenCards tutorial

This is the tutorial page for PhenCards. For ease of learning and accessibility, the video is broken down into segments and the transcript is listed for each segment on this page.

This is the start of the video. If you want to learn it all at once, feel free to watch the whole video. If you want to skip around, you can use the sections below. The sections are labeled with their timestamps on YouTube below their embedded videos. You can also search using the text of the transcript below each video. The embedded videos are paused at each respective timestamp for the transcript below it.

0:00 [Intro]

Welcome to PhenCards. PhenCards is a web server that links human phenotype information to biomedical knowledge. There are two ways to use PhenCards. First and foremost, you can search by phenotype term directly, whether that term be a trait, symptom or disease. If you start typing a term, for example, “cleft palate,” after the first three characters the autocompletion will suggest potential terms for you. The second way, which I will demonstrate first, is by submitting patient notes or clinical free text. The example in this text box here is from a real deidentified patient and the paper is cited below.

0:38 [Patient: Extracting Terms]

Now that we have our patient notes, we can click “Search” here at the bottom. It will take just a moment to parse these notes using a custom version of Doc2HPO hosted on the Wang Lab servers. You’ll see that the note text is now in the gray box and some terms are highlighted red and some are highlighted in green. The green text indicates accepted HPO terms that were extracted by Doc2HPO. The red text indicates extracted terms that are negated because the text had negative verbs or words like “did not” so they are not included in the final result.
The table below has all of the extracted terms and their respective HPO IDs. If you click on an HPO ID, it will take you to the HPO website, where you can find more details about the term. Alternatively, and this is what we would prefer, you can click on the term itself, which will search PhenCards using the phenotype term search I mentioned earlier.
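To make the green and red highlighting concrete, here is a toy sketch of sorting extracted terms into accepted and negated groups. It is not how Doc2HPO works internally; the cue list, the function name `split_terms`, and the rule "a cue earlier in the same sentence negates the term" are all assumptions made for illustration.

```python
import re

# Hypothetical cue list; real negation detectors use far richer rules.
NEGATION_CUE = re.compile(r"\b(?:did not|does not|no|not|never|without|denies)\b")


def split_terms(note, terms):
    """Sort each term into accepted or negated, based on cues earlier in its sentence."""
    accepted, negated = [], []
    # Split the note into sentences at ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        lowered = sentence.lower()
        for term in terms:
            position = lowered.find(term.lower())
            if position == -1:
                continue  # term is not mentioned in this sentence
            # Only text before the term is checked for a negation cue.
            if NEGATION_CUE.search(lowered[:position]):
                negated.append(term)
            else:
                accepted.append(term)
    return accepted, negated


note = "He has a broad forehead. He did not have seizures."
accepted, negated = split_terms(note, ["broad forehead", "seizures"])
print("accepted:", accepted)
print("negated:", negated)
```

In this made-up note, "broad forehead" would be shown in green and "seizures" in red, and only the accepted terms would move on to the results table.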

1:34 [Patient: Disease Information]

Under Diseases, you can see which diseases are predicted to be linked to the HPO terms from the notes. This algorithm uses the diseases linked to HPO terms from the HPO database and ranks them by the combined Elasticsearch score of those terms. You can see here that the number one ranked disease is Aarskog-Scott syndrome, which is in fact the causal disease for the patient’s condition in the cited paper. If you click on the disease name, it will take you to the HPO database link for the disease and from there you can click the links out to Orphanet and OMIM.
So in OMIM here you can see that the causal gene is FGD1. Likewise, if you go back and click on Orphanet, you scroll down to the Genes section, and here you can also see that FGD1 is the causal gene for Aarskog-Scott syndrome.
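The disease ranking described in this segment can be pictured with a small sketch. The transcript only says that diseases are ranked by the "combined" Elasticsearch score of their linked HPO terms, so adding the scores together is an assumption here, and the term names, disease names, and scores below are placeholders rather than real data.

```python
from collections import defaultdict


def rank_diseases(term_scores, term_to_diseases):
    """Rank diseases by the summed score of the extracted terms linked to them."""
    combined = defaultdict(float)
    for term, score in term_scores.items():
        for disease in term_to_diseases.get(term, ()):
            combined[disease] += score
    # Highest combined score first; ties are broken alphabetically.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Placeholder inputs: search scores per extracted term, and term-to-disease links.
term_scores = {"term_a": 2.0, "term_b": 1.5, "term_c": 1.0}
term_to_diseases = {
    "term_a": ["disease_x", "disease_y"],
    "term_b": ["disease_x"],
    "term_c": ["disease_y", "disease_z"],
}

for disease, score in rank_diseases(term_scores, term_to_diseases):
    print(disease, score)
```

A disease that is linked to many of the patient's terms accumulates a higher total, which is the intuition behind the causal disease rising to the top of the list.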

2:33 [Patient: Genes]

If you look under the Genes section, you can see the Phen2Gene results for the extracted HPO terms. Phen2Gene is a tool from the Wang Lab that ranks candidate genes for a phenotype based on gene-disease and gene-gene information linked to HPO terms. The third ranked gene here, FGD1, is actually the causal gene in the paper for this patient phenotype, and we saw it on OMIM and Orphanet earlier as well. If you click on the gene name to learn more about it, it links out to MedlinePlus, which is a free service for health information from the National Library of Medicine.

3:08 [Patient: Clinical Trials]

You can view Clinical Trials data from ClinicalTrials.gov. The search involves an automated “OR” query using the HPO terms, as many symptoms are treated individually. For example, the first result here is a clinical trial for frenotomies to repair ankyloglossia, a condition where a segment of tissue ties the tongue’s tip to the floor of the mouth. And this hasn’t even started recruiting yet.
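As a rough illustration of the automated “OR” query, the sketch below joins a list of terms into one search string. The function name `build_or_query` and the choice to wrap each term in double quotes are assumptions; the site may format its query differently.

```python
def build_or_query(terms):
    """Join phenotype terms into one search string separated by OR."""
    cleaned = [term.strip() for term in terms if term.strip()]
    # Quoting keeps multi-word terms together as a phrase (an assumption).
    return " OR ".join(f'"{term}"' for term in cleaned)


print(build_or_query(["ankyloglossia", "cleft palate", "short stature"]))
```

Because the terms are joined with OR, a trial only has to match one of the patient's symptoms to be returned, which fits the point that many symptoms are treated individually.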

3:35 [Patient: Literature]

Lastly, you can perform a literature search using the HPO terms via Google Scholar. You can remove the “OR” operators to make the search even more specific, or you can add more terms. If you remove enough of the “ORs,” the search becomes so specific the only result returned by Google Scholar is the paper we got this clinical text from. And that is all for the patient section.

3:57 [Term Results: Navigation]

Now, let’s get into the primary function of PhenCards, the phenotype term search. Let’s use the default term used in the paper, craniosynostosis. A search takes just a moment. You’ll notice right away that the navbar lets you jump around to a lot of different sections with smooth scrolling, so you can see where you’re going on the site, and the highlighted section changes automatically as you scroll through the site. Next, you should notice at the top there is some helpful information telling you how to navigate this results page. One key feature is these hoverable tooltips that you can use to get more information about headers of a table, or information contained within them. At any time, if you want to go back and submit a new term or clinical notes, you can merely click the PhenCards logo at the top left.

4:44 [Term Results: Aliases]

The Aliases section contains likely matches for your search term across several different databases. You do not have to accept an autocomplete query when searching PhenCards, and you may want to just copy and paste a term. With Elasticsearch built into much of the site, you can find the most likely matches for your query in this section, and maybe even use that information to refocus your search.
The first example we have here is results for craniosynostosis that come from the HPO database. If you look at the header, you can find some helpful tooltip examples. This is the HPO term, the HPO ID, the list of alternate IDs, child term IDs, parent term IDs, linked external database IDs, and then last but not least, the Elasticsearch score for the query term. Basically, the higher it is, the more likely it is what you are looking for. It is based on an n-gram-based fuzzy matching algorithm. It is pretty accurate for the most part, but there are some edge cases, of course, to every search.
There are also some field values that have their own tooltips or links. For example, the entry for craniosynostosis has a curated description field and we have used that for the tooltip so you can better understand the term. The HPO ID here links out to the HPO website for more information on the term. You can do likewise for the ICD-10, OHDSI, and MeSH term IDs as well.
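To give a feel for why n-gram-based fuzzy matching tolerates typos and pasted terms, here is a minimal sketch that compares character trigrams. It is not the Elasticsearch scoring formula, and its 0-to-1 values are not on the same scale as the scores shown on the site; the padding with spaces and the Jaccard overlap measure are assumptions chosen to keep the example short.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower().strip()} "
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(query, candidate, n=3):
    """Jaccard overlap of the two n-gram sets: 1.0 is identical, 0.0 is disjoint."""
    a, b = char_ngrams(query, n), char_ngrams(candidate, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


query = "craniosinostosis"  # deliberately misspelled
for candidate in ["craniosynostosis", "crab allergy"]:
    print(candidate, round(ngram_similarity(query, candidate), 3))
```

A misspelled query still shares most of its trigrams with the correct term, while an unrelated term that merely starts with the same letters shares very few, so it can still appear in the results but with a much lower score.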

6:27 [Term Results: UMLS]

We are currently, as of the making of this video, the only site that allows for the searching of UMLS terms with one critical caveat: you need a UMLS Terminology Services license, or UTS license, from the National Library of Medicine, which is not really difficult to get if you are a researcher. After putting in your UTS credentials, we use HTTPS to securely pass them to the NLM directly without saving them, and only obtain a boolean value of True or False from them which we use to determine if you are authenticated correctly or not. If your information is incorrect, you will be kicked back to the Results page. If your information is correct, you will actually be able to view the results.
Here I’m going to use my actual UTS ID and password so I can properly authenticate. You can choose to save this information in your password manager if you so wish on your own end, so you do not need to type it every time. We obviously do not save any of that information on our side. You can also search for your database of choice like NCI, or MedlinePlus, using the DataTables search function, which also works on any other table on the site.

7:39 [Term Results: Related Terms]

The Columbia Open Health Data resource is extremely useful for linking co-occurring terms in Columbia Med patient notes to a matching OMOP or OHDSI search term for your query. Let’s click on the OMOP ID for Craniosynostosis syndrome here. At the top, you’ll note that this uses the temporal beta dataset which currently has patient notes across several years for about 4.5 million unique patients. The concept count shown here is based on occurring once per patient, not multiple times. First, these are the ancestor terms for the search term. They are sorted by concept count, which effectively amounts to sorting them by level of separation.
So the lowest level will have our actual term, craniosynostosis syndrome, then cranial suture finding, anomaly of joint, head, bone, et cetera.
The more interesting part of this data, however, is looking at the co-occurring conditions, drugs, and procedures in Columbia Med patients for your OMOP search term. These are sorted by most significant chi-square value, and the calculation is explained here in the tooltip in the header. You can see that craniosynostosis co-occurs with acrocephalosyndactyly and Crouzon syndrome, both of which have craniosynostosis as a symptom or sub-phenotype. If you look at the co-occurring drug data, you will notice the biggest hits are acetaminophen and lidocaine for pain, for example, and vitamin and glycerin supplements. Perhaps most interesting for this particular condition are the co-occurring procedures, like this detailed reconstruction that includes the orbital rims and forehead, or this extensive craniectomy.

9:31 [Term Results: Diseases]

The search brings up Disease aliases as well. For the Disease Ontology database, we have a good example of when there was no clear match, but fuzzy search still gave us some results. Obviously crab allergy is not right, but at least osteopathia striata with cranial sclerosis is in the same ballpark. You have to use your best judgment for more obscure terms in situations like these, but generally if the score is above 80, it’s probably a pretty near-exact match. You can also look at the disease databases linked to the HPO terms as in the patient notes section of the site from earlier. This algorithm is a bit different in that it uses the linked HPO term and the disease name to search the linked databases. One downside of this is that in cases like craniosynostosis or cleft palate where the HPO term name is also the disease name, this search favors those results more strongly, as you can see here. But if you know you’re looking for a more specific disease, like Crouzon syndrome for instance, or Beare-Stevenson, you may want to ignore those two-hit results. You can click on the database ID, as in the patient section earlier, to take you to HPO and then to OMIM, or Orphanet, or wherever, and you can find the alternate names and causal genes that are linked to these diseases and conditions.

10:57 [Term Results: Genes]

Speaking of genes, the Genes section uses Phen2Gene from the Wang Lab to prioritize the top 1000 potential causal candidate genes for the top HPO term result. If there are no HPO term results (although we have craniosynostosis above), this section will return no results. You can see that the number one ranked gene here is FGFR2 which is also the typical gene where causal variants lie in Orphanet and OMIM for Crouzon syndrome (and several other syndromes related to craniosynostosis).
I can show you an example here with Crouzon syndrome. If you click on the database ID for OMIM, you can see that FGFR2, which is the number one result for Phen2Gene here, is the causal gene for Crouzon syndrome and also for Beare-Stevenson cutis gyrata syndrome.

11:57 [Term Results: Pathway]

Pathway data is available from Pathway Commons and KEGG for the search term. If you click on Pathway Commons, a new tab opens and you can see the results. The first result here is Activated point mutants of FGFR2, which is the same causal gene you may remember from earlier. You can see the number of participants in the pathway and the number of subprocesses, as well as links to the ancestral pathways. If you click on the pathway name, it will take you to the site from where Pathway Commons obtained the data, which is usually Reactome. In Reactome, this pathway is mentioned in reference to Crouzon syndrome, Beare-Stevenson cutis gyrata syndrome, and Apert syndrome, all of which have craniosynostosis as a phenotypic trait. You also can click on the pathway image and explore it further as you like.
As for the KEGG section, I wrote a small script that takes a moment to search KEGG for disease hits and then searches KEGG for pathways linked to those disease hits. Most of the time the results are pretty broad, like for the KEGG term “Craniosynostosis and dental anomalies” it returns the very general Cytokine and JAK/STAT pathways. That could contain some useful information, of course, but generally we recommend using Pathway Commons, which also contains some KEGG data.
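The two-step KEGG lookup described above (find disease hits, then find pathways linked to each hit) might look roughly like the sketch below. This is not the script used by the site. The base URL and the `find/disease` and `link/pathway` paths are assumptions about the public KEGG REST interface and should be checked against its documentation, as should the tab-separated response format the parser expects.

```python
from urllib.parse import quote
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"  # assumed base URL of the KEGG REST service


def fetch_text(url):
    """Download a URL and return its body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_tab_lines(text):
    """Split tab-separated response lines into (left, right) pairs."""
    rows = []
    for line in text.splitlines():
        if "\t" in line:
            left, right = line.split("\t", 1)
            rows.append((left.strip(), right.strip()))
    return rows


def pathways_for_term(term, fetch=fetch_text):
    """Step 1: find disease hits for the term. Step 2: list pathways linked to each."""
    diseases = parse_tab_lines(fetch(f"{KEGG_REST}/find/disease/{quote(term)}"))
    results = {}
    for disease_id, disease_name in diseases:
        links = parse_tab_lines(fetch(f"{KEGG_REST}/link/pathway/{disease_id}"))
        results[disease_name] = [pathway_id for _, pathway_id in links]
    return results


# Example call (needs network access, so it is left commented out):
# print(pathways_for_term("craniosynostosis"))
```

Passing the `fetch` function in as a parameter makes it easy to try the logic with canned responses before pointing it at the live service. One request per disease hit also explains why a lookup like this takes a moment to run.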

13:18 [Term Results: Drugs]

For Drugs results, we have the drugs returned when searching the company databases for APExBIO and Tocris, which just link out to their sites. The main function of the Drug section, however, is the openFDA FAERS, or FDA Adverse Event Reporting System, database. This first table displays the most prominent reaction hits for your term query, the most prominent of which is craniosynostosis here. This is good because it means most of the data below will be related to patients who experience craniosynostosis as a reaction to drugs. Next, we will see the drugs that are related to, and likely causal for, the reactions in the first table. The first 3 results are all SSRIs, or antidepressants, that are known to cause craniosynostosis in neonates when taken by pregnant women. Most commonly this is taken in tablet form, so just your everyday antidepressant medication. The next few tables confirm the idea that these reactions are mostly from neonates, as most patients with the reaction weigh around 3 kg, and the only ones with these reactions are neonates, infants, and children. The outcomes are mostly unknown since this is a pretty expensive and complex condition, but for determinate outcomes these are largely considered to be resolved or recovered.
The next part of this data is a bit more sparse but is related to drugs that are given to patients who have this indication in openFDA, for example, tranexamic acid for bleeding post surgery, or fentanyl or oxycodone for pain. Then there are reactions to these prescribed drugs such as artery occlusion, and by which routes of administration they are typically given.

15:03 [Term Results: Literature]

Our Literature search here uses PubMed as opposed to Google Scholar like we used in the patient notes section. This algorithm is a bit more complicated and takes a little while longer to run because it takes the term, gets the top 200 PubMed results using the Entrez ESearch Best Match algorithm using MeSH terms, and then grabs the top 25 most cited results from those using Entrez EFetch. So it’s a little bit limited by API speed. You can click on the PubMed IDs to take you directly to the articles. These papers are not necessarily old, many are newer, because of the Best Match search component, but are generally well-cited and relevant.
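The two stages of the literature search can be sketched as follows: build an ESearch request for the top 200 Best Match hits, then keep the 25 most cited records. This is an illustration, not the site's code. It assumes that `sort=relevance` is the ESearch parameter corresponding to Best Match and that citation counts are already attached to each record. The `citations` field, the placeholder PMIDs, and the counts are hypothetical.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, retmax=200):
    """Build an ESearch URL asking for the top PubMed hits sorted by relevance."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": retmax,
        "sort": "relevance",  # assumed to correspond to Best Match
        "retmode": "json",
    }
    return f"{ESEARCH}?{urlencode(params)}"


def top_cited(records, limit=25):
    """Keep the most cited records; ties stay in their original (relevance) order."""
    return sorted(records, key=lambda record: record["citations"], reverse=True)[:limit]


print(esearch_url("craniosynostosis"))

# Placeholder records standing in for fetched results with citation counts attached.
records = [
    {"pmid": "pmid_a", "citations": 5},
    {"pmid": "pmid_b", "citations": 50},
    {"pmid": "pmid_c", "citations": 20},
]
print([record["pmid"] for record in top_cited(records, limit=2)])
```

Filtering by relevance first and by citations second is what lets newer papers appear: a paper only has to be among the 200 best matches to compete on citation count.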

15:46 [Term Results: Foundations and Grants]

This is an interesting component of the site that I am pretty proud of. This section involves Foundations and Grants related to your phenotype term search. The first set of data comes from the IRS, or the Internal Revenue Service, and returns non-profits or tax form 990 filers related to the phenotype term. This data is totally free and public domain. Their database is on AWS with a README explaining it. We have downloaded the database and use Elasticsearch to extract relevant foundations on the backend. You can see here the first results are the Craniofacial Foundation of America and the Children’s Craniofacial Association, both of which make sense, and if you click on the URL ID, you can go to the public XML page and find out where this place is, who runs it, what the phone numbers are, and so forth, if you or your patient potentially need these resources. You can even search the employer identification number, or EIN, on Google to learn more about the organization.
We also have parsed the Open990 foundations and grants databases, which is essentially data parsed from these public XML files I mentioned earlier from the IRS. The data is not all there or always well parsed, but it can sometimes be useful and is likely preferable to reading the raw XML. One benefit is looking at grants that were given to people and how much money was awarded, which is also contained in the IRS XML files.
Active Funding Opportunity Announcements from the NIH are also available here if you are interested in trying to write a grant centered around this condition. If you click on the first link, you can see craniosynostosis is there in the FOA, which explains its presence in the results. We even list the expiration and application dates, and funding sponsors.
Lastly, we provide data from NIH’s Federal Reporter service which consists of actively funded grants from all government organizations including the NIH and the NSF. You can click the link to learn more details and we provide the agency, the PI, cost of funding and where the PI is located in case you have an interest in collaborating on potential bleeding edge research for your patients.

18:02 [Term Results: Clinical Trials]

Just like the patient notes section, the phenotype term search provides relevant clinical trial data for the query. You can click on the links like before and learn more about each clinical trial. A good example of a trial that may be useful for a patient with craniosynostosis is the endoscopic strip craniectomy trial here, which is an actual procedure, as opposed to the fMRI studies or the surveys listed above.

18:29 [Term Results: Citations and Licenses and Closing]

Finally, here at the bottom you can find all of the different citations for the resources we have used, their licenses, and links out to the sites where relevant. Hopefully this answers all questions you have about the site. Feel free to contact us with any feedback or questions if you desire. And thank you very much for your time.

Browser Compatibility

OS       Version          Chrome         Firefox  Microsoft Edge  Safari
Linux    Ubuntu 20.10     87.0.4280.66   81.0.2   n/a             n/a
MacOS    Big Sur          87.0.4280.66   81.0.2   n/a             14.0
Windows  10 20H2          87.0.4280.66   81.0.2   44.17763.831.0  n/a
iOS      14.2             86.0.4240.93   n/a      n/a             14.0
Android  10#G965USQUFTJ3  86.0.4240.198  n/a      n/a             n/a