Celiac Project

Stanford URJ article on Celiac Inhibitors
http://www.stanford.edu/group/journal/cgi-bin/wordpress/wp-content/uploads/2012/09/Cho_NatSci_2005.pdf

**Objective:**

To make up for class attendance, I am working on this project on Celiac Disease. The goal is to complete background research on Celiac Disease: to understand its cause and the current research toward finding a cure. I will then screen the drug database for compounds that could act effectively on the protein that accounts for the negative reactions to gluten products in Celiac patients. Such a small molecule compound would ideally bind to the protein in place of the natural amino acid residues of the gliadin peptide. I will then report my findings.

**Background:**

Celiac Disease (CD) is also known as celiac sprue or gluten sensitive enteropathy. It is relatively common, especially in the United States where one out of 133 people is affected. This high incidence accounts for the variety of gluten-free food products often found in grocery stores.

CD is a chronic, inherited autoimmune condition that results in abnormal reactions to gliadin, which is found in gluten-containing grains such as wheat, rye, barley, and triticale. Because it is an autoimmune response rather than the overreaction to an allergen seen in allergic reactions, it cannot be classified as an allergy. As a result, unlike people who may grow out of a food allergy, people with Celiac Disease will be affected for their whole life.

Symptoms of the disease are triggered by consumption of gluten. When individuals with this condition eat gluten, their bodies react in a way that causes their immune systems to damage the villi of their small intestines.

These damaged villi and microvilli are subsequently unable to effectively absorb nutrients from food. Proteins, carbohydrates, fats, vitamins, and minerals cannot be taken up (in some cases water and bile salts are not absorbed either). As a result, if CD is not treated, it may be life-threatening and may increase the individual’s risk of other disorders that arise from poor nutrition. These conditions include:
 * Iron deficiency anemia
 * Early onset osteoporosis or osteopenia
 * Vitamin K deficiency associated with risk for hemorrhaging
 * Vitamin and mineral deficiencies
 * Central and peripheral nervous system disorders (usually due to unsuspected nutrient deficiencies)
 * Pancreatic insufficiency
 * Intestinal lymphomas and other GI cancers (malignancies)
 * Gall bladder malfunction
 * Neurological manifestations

Although celiac disease is a genetic disease, its onset can occur at any time in life. However, genetic testing alone may not be accurate in diagnosing it. Instead, there are 5 recommended blood tests, outlined on the Celiac Disease Foundation website, that are more accurate. Each tests for the presence of a different antibody in the blood after the consumption of gluten. This, along with the genetic testing results and possible symptoms, may provide an accurate diagnosis.

**Current treatment:**

Because no drug has yet been found that prevents the autoimmune response associated with Celiac Disease, the only effective treatment is a gluten-free diet, since even the smallest amounts of gluten are damaging. Once gluten products are removed from the diet, the small intestine begins to heal itself and eventually returns to normal function. The length of this process, however, varies by individual and may take up to several years.

Treatments for this condition are being studied. Ideas include enzymes that work to break down and detoxify gluten before it reaches the small intestine. If this succeeds, patients would be able to eat gluten products as long as they also have access to this enzyme or enzyme complex.

**Target information:**

Celiac Disease is often associated with the presence of HLA-DQ2, which is a serotype group in the HLA-DQ serotyping system. A serotype is a group of cells or microorganisms (such as bacteria or viruses, but also human cell types) characterized by a shared set of surface antigens (molecules that provoke the production of specific antibodies in the body). When foreign antigens enter the body, the immune system responds by generating antibodies, blood proteins that recognize and attach to their corresponding antigens and mark them for elimination. In the case of Celiac Disease, the HLA-DQ2 protein on the patient’s immune cells binds gliadin peptides from consumed gluten and presents them to the immune system. This produces antibodies and elicits an unnecessary immune response, which leads to damage to the villi in the small intestine.



The gliadin peptide that serves as the antigen that causes the reaction to gluten is found in many different species of the tribe Triticeae. It is resistant to various proteases and peptidases in the intestinal system.

**Proposed Treatment:**

If a small molecule compound can bind to the protein in place of the amino acid residues of the gliadin peptide (and with higher affinity), it may prevent the body from recognizing gluten and overreacting to it. If the compound can prevent an immune response, the villi of the small intestine will not be damaged and Celiac Disease symptoms can be prevented even with the consumption of gluten products. The discovery of a compound that can serve this function is the goal of this project.

**Protocol:**

1. The crystal structure of HLA-DQ2 containing the gliadin peptide was found in the PDB. The PDB ID is 1S9V.
2. Chain B of the protein was removed in PyMol.
3. The validity of the new protein was determined with MolProbity:
 * Go to the MolProbity website. TIP: use Internet Explorer.
 * ‘**Choose File**’: load **your PDB** file to the website (use the file you have from above instead of entering the PDB identifier directly into MolProbity).
 * Copy down the resolution (to put in your table below).
 * Add Hydrogens. On the next page, choose the option to add the flips, ‘**Asn/Gln/His flips**’. Leave all the defaults checked and ‘**regenerate H’s**’.
 * If it asks you to save the file with hydrogens, skip it for now.
 * **Analyze all-atom contacts and geometry**. Use all defaults **>> Run Programs to Perform these Analyses**.
 * **View the Multi-Criterion Chart**. Save the data for your table below.
 * View the **Multi-Criterion Kinemage** to see what type of errors exist in the active site. Make note of which type they are for your table below. HINT: to see the active site, toggle on and off the **‘hets’** button.
 * **Include** a ‘snip’ of your traffic light table from MolProbity (i.e. green, yellow, red table).
 * **Also include** these values for the structure in your notebook.
**1S9V: Together**
|| || **Values** || **Percentages** ||
|| **Resolution** || 2.22 Å || ||
|| **Clashscore, all atoms** || 8.45 || ||
|| **Poor rotamers** || 8 || 2.38% ||
|| **Ramachandran outliers** || 2 || 0.55% ||
|| **Ramachandran favored** || 347 || 95.59% ||
|| **Cβ deviations >0.25 Å** || 0 || 0% ||
|| **MolProbity score^** || 2.05 || ||
|| **Residues with bad bonds** || 0/1480 || 0% ||
|| **Residues with bad angles** || 1.1843 || 0% ||
|| **Error in Active Site** || 1 || ||
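
Step 2 of the protocol (removing chain B) was done in PyMol. For readers without PyMol, the same edit can be sketched as a plain text filter over the PDB file. This is a minimal sketch, not the method used here: the function name `remove_chain` is hypothetical, and it assumes a standard fixed-column PDB file in which the chain ID sits in column 22 of each coordinate record.

```python
def remove_chain(pdb_text: str, chain_id: str) -> str:
    """Return pdb_text without the coordinate records of one chain.

    Assumes standard fixed-column PDB records, where the chain
    identifier is the 22nd character (index 21) of the line.
    """
    coordinate_records = ("ATOM", "HETATM", "ANISOU", "TER")
    kept = []
    for line in pdb_text.splitlines():
        is_coordinate = line.startswith(coordinate_records)
        # Drop only coordinate records that belong to the unwanted chain;
        # headers, remarks and other chains pass through unchanged.
        if is_coordinate and len(line) > 21 and line[21] == chain_id:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

With the 1S9V file read into a string, `remove_chain(text, "B")` would return the text to save as the new PDB file before uploading it to MolProbity.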

4. Once the protein crystal structure was deemed to be accurate, it was transferred to the DDFE through WinSCP into the folder Celiac1.
5. The ligand library to be screened was selected from /home/chem204/DatabasesVDS/LabVS3_PTP1bLibrary (CB1k_10) and transferred to the Celiac1 folder as well.
6. The number of ligands in the library was determined by running the countsdf.pl file found in the LabVS3_Library folder with the command perl countsdf.pl.
7. Positive and negative controls were added to the library. Three positive control ligands were found from [], and are compounds that are known to be inhibitors of enzymes similar to 1S9V.
|| **CID** || 44437812 ||
|| Ki value || 300,000.0 nM ||
|| Molecular Weight || 839.934 g/mol ||
|| Molecular Formula || C40H57N9O11 ||
|| XLogP3-AA || -2.7 ||
|| H-Bond Donor || 8 ||
|| H-Bond Acceptor || 12 ||


|| **CID** || 44437813 ||
|| Ki value || 40,000.0 nM ||
|| Molecular Weight || 853.961 g/mol ||
|| Molecular Formula || C41H59N9O11 ||
|| XLogP3-AA || -2.3 ||
|| H-Bond Donor || 8 ||
|| H-Bond Acceptor || 12 ||

|| **CID** || 44437814 ||
|| Ki value || 200,000.0 nM ||
|| Molecular Weight || 867.987 g/mol ||
|| Molecular Formula || C42H61N9O11 ||
|| XLogP3-AA || -2 ||
|| H-Bond Donor || 8 ||
|| H-Bond Acceptor || 12 ||

8. The identified ligands, as well as a random control aspirin ligand and the extracted original ligand, were downloaded as 2D compounds. They were converted to 3D at pH = 7.4 with OpenBabel. The resulting compounds were verified in PyMol.
9. A gold.conf file was generated with the HERMES interface.

**Connecting to the graphical interface for GOLD**

Make a remote connection to the DDFE using a graphical user interface (GUI) for GOLD:
 * Open the Xming server: go to Start, Programs, Xming, Xming.
 * Open XLaunch: go to Start, Programs, Xming, XLaunch. Select ‘Multiple Windows’, select ‘Start no client’, skip the next screen by selecting ‘Next’, then select ‘Finish’ on the next screen.
 * Open **Putty** in Programs.
 * Connect to Host Name: __ddfe.cm.utexas.edu__ on Port __22__ using __SSH__.
 * On the left side of the window, select the ‘SSH’ tab and then the ‘X11’ or ‘Tunnels’ tab. Check __‘Enable X11 forwarding’__. X display location: leave blank or enter __localhost:0__ (this is the default display on your computer). Click ‘Open’.
 * Login as user: type your user name for the DDFE (your UTEID). Enter your password.
 * You must put yourself into the directory where your protein file is. Type ‘ls’ to see the contents and ‘cd’ to change directories.
 * Then type this to open GOLD with the graphical user interface: **$gold**
 * Ignore the ‘BadFont’ error message, if present.
 * Don’t load a Conf file at the top (that is what you will be making here).

Step through the Configuration Options to set up your file:
 * Skip Wizard. Skip Templates.
 * **Protein > Load protein**: .pdb
 * Gold 5.0 has a separate window for Global Options and a specific window for operations on your protein.

**Under the tab for __your PDB__ file name (to the right); the tab may just say “ID”:**
 * **Protonation & Tautomers** > __Add Hydrogens__. __Write down__ how many hydrogens were added. (2253 hydrogens added)
 * Skip **Flip Asn Gln and His tautomers**. We won’t worry about these right now.
 * **Extract/Delete Waters**: ‘Delete Remaining Waters’ (don’t select any of them to save). __Write down__ how many waters were removed. (89 waters removed)
 * **Delete Ligand**. NOTE: If there is more than one ligand, you will need to go into the **Hermes** visualizer window to figure out which ligand: go to __View__ >> __Protein Explorer__ and click on the ‘+’ (plus sign) to see the different objects.
 * Extract the ligand by clicking the ID tab and “delete ligands”; save as ‘**LigandExtracted.mol2**’ (this will be saved for defining the cavity site). __NOTE:__ you already saved a different PubChem version of this ligand for validation docking.
 * **Back in WinSCP**: __make sure your LigandExtracted has an extension__. If not, then add it to the file (just add **.mol2** to the end).
 * Skip the remaining options for the protein.

**Under Global Options:**
 * Define Binding Site: select __‘One or more ligands’__ and choose the single ligand that you had extracted.
 * ‘Select all atoms within __7.5 Angstroms__’.
 * Leave ‘Generate a cavity’ unchecked.
 * Check ‘Detect cavity’.
 * Check ‘Force all H bond donors/acceptors ….’, then verify the active site in the image on the Hermes visualizer (only a small region of the protein around the ligand will be highlighted in gray).
 * In the Gold GUI, go back to **Global Options**.
 * Select Ligands: you will need to find where your library is (probably in your Celiac1 directory), e.g. **All_pH.sdf**. This is the file you need to link to for your ligand library. Then make sure the number of conformations per ligand, or **GA Runs**, is set to ‘10’.
 * Skip the Reference Ligand. Skip ‘Configure Waters’. Skip ‘Ligand Flexibility’.
 * Leave the defaults for ‘Fitness & Search Options’; it will use the CHEMPLP scoring function.
 * ‘GA Settings’: 10%
 * Output Options, output directory: leave as it is (‘.’).
 * UNCHECK ‘save ligand rank (.rnk) files’.
 * UNCHECK ‘save ligand log files’.
 * UNCHECK ‘save initialized ligand files’.
 * Save solutions to one file: ‘YourTargetvsYourLibraryRun1.sdf’, e.g. “vsCB1kRun1.sdf”.
 * bestranking_list_filename: ‘BestYourTargetvsYourLibraryRun1.lst’, e.g. “BestvsCB1kRun1.lst”.
 * Skip ‘Information in File’ and ‘Selecting Solutions’ (we will keep all solutions).
 * Skip GoldMine.
 * Skip Parallel GOLD: we will run in parallel, but it will be executed remotely instead of at this console.
 * Skip ‘Constraints’.
 * ‘Atom Typing’: automatically **set atom and bond types (for the ligand only)**. Make sure only one box is checked, ‘Ligand’ only.
 * At the top of the page hit Save. Hit ‘Finish’ to save the file.
 * Save the GOLD conf file as **gold.conf**.
 * Save the protein as **<PDBname>protein.mol2**.
 * Then close GOLD/Hermes.
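
The gold.conf file saved from the GUI is plain text, so individual settings can also be checked or changed with a short script instead of by hand. The sketch below assumes the settings of interest appear as `key = value` lines; the function name `set_conf_values` is hypothetical, and the sample text is an illustrative fragment rather than a real gold.conf.

```python
import re


def set_conf_values(conf_text: str, updates: dict) -> str:
    """Rewrite the first 'key = value' line found for each key in updates.

    Lines that are not simple 'key = value' settings are left untouched.
    Raises KeyError if a requested key never appears in the text.
    """
    remaining = dict(updates)
    out_lines = []
    for line in conf_text.splitlines():
        match = re.match(r"^(\s*)([A-Za-z_]\w*)\s*=\s*(.*)$", line)
        if match and match.group(2) in remaining:
            indent, key = match.group(1), match.group(2)
            # pop() ensures only the first occurrence of a key is changed.
            out_lines.append(f"{indent}{key} = {remaining.pop(key)}")
        else:
            out_lines.append(line)
    if remaining:
        raise KeyError(f"keys not found in conf text: {sorted(remaining)}")
    return "\n".join(out_lines) + "\n"


# Illustrative fragment only (not copied from a real gold.conf).
sample = "autoscale = 0.1\nradius = 10\norigin = 0 0 0\n"
updated = set_conf_values(sample, {"autoscale": 1, "radius": 18})
```

Raising an error for a missing key is a deliberate choice here: a silently ignored setting would be hard to notice once the docking job is running.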

10. The generated gold.conf file was verified, and the autoscale and radius were changed to 1 and 18 Å, respectively.
11. The ligands were concatenated with the command cat CID_44437812_3DpH.sdf CID_44437813_3DpH.sdf CID_44437814_3DpH.sdf CID_2244aspirin_pH.sdf OriginalPeptidepH.sdf >> **LibrarypH.sdf**
12. The scriptgoldscanthisjob.sh script file was transferred from the /home/chem204/scripts directory to the Celiac1 folder.
13. The 1005 ligands were screened with 201 processors and the results were concatenated.
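
The counting, concatenation, and splitting steps above can be sketched in Python for readers who want to check the numbers themselves. This is a rough sketch, not the countsdf.pl or job script used here: all three function names are hypothetical, and it relies only on the SDF convention that every record ends with a line containing `$$$$`.

```python
def count_sdf_records(sdf_text: str) -> int:
    """Count molecules in SDF text: each record ends with a '$$$$' line."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")


def concatenate_sdf(texts) -> str:
    """Join several SDF texts, making sure each one ends with a newline
    so that a '$$$$' delimiter never fuses with the next record."""
    return "".join(t if t.endswith("\n") else t + "\n" for t in texts)


def ligands_per_processor(n_ligands: int, n_processors: int) -> list:
    """Split n_ligands as evenly as possible; one count per processor."""
    base, extra = divmod(n_ligands, n_processors)
    return [base + 1 if i < extra else base for i in range(n_processors)]
```

For the run described above, `ligands_per_processor(1005, 201)` gives an even split of 5 ligands on each of the 201 processors, and `count_sdf_records` on the final library text should return 1005 if all controls were appended correctly.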