Header image
line decor
            HOME  ::  DOCUMENTATION  ::  I/O FORMATS  ::  GALLERY  ::  CONTACT
line decor

Reference

If you use the results of the CS23D2.0 server in a publication, please cite the following paper:
Wishart DS, Arndt D, Berjanskii M, Tang P, Zhou J, Lin G. CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data. Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W496-502.

Tables

Table 1: RMSDs of distorted X-ray structures before and after chemical shift refinement. The high quality X-ray structures of 9 proteins, for which complete chemical shifts were also available, were distorted by randomly varying all phi and psi angles uniformly within ±5°. The distorted structures were then minimized using chemical shift refinement with GAFolder for 300 iterations. Corrected chemical shifts from RefDB were used.

Protein Name

BMRB ID

X-ray PDB ID

Backbone RMSD with respect to native structure (Å)

Before minimization

After minimization

Barnase

BMR4964

1A2P

2.46

download

1.44

download

Ribosomal Protein L25

BMR4395

1DFU

2.65

download

1.04

download

Melanoma Inhibitory Activity Protein

BMR5220

1I1J

3.98

download

3.73

download

Calbindin

BMR6699

1IG5

1.43

download

1.31

download

Ubiquitin

BMR5387

1UBQ

2.45

download

1.85

download

Ribosomal Protein L30e

BMR5485

1W41

2.76

download

2.06

download

CI-2

BMR4974

2CI2

1.83

download

0.79

download

Vts1:RNA Dimer

BMR6922

2D3D

2.2

download

1.92

download

BPTI

BMR4968

6PTI

3.2

download

0.79

download



Table 2: Results of modeling and refining 9 structures and comparison with native X-ray and NMR structures. Native structures were excluded from CS23D2.02.0 databases during model generation. Refinement for each model involved 10 runs of 300 iterations each, and selection of the model with the lowest energy.

# Rama violations - indicates the number of residues that fall in the generously allowed and disallowed regions of the Ramachandran plot as adapted from Lovell SC, et al. Structure validation by Calpha geometry: phi,psi and Cbeta deviation.
Proteins 2003 Feb 15;50(3):437-50. The fewer violations, the better the structure. Ideally there should be no violations.

# Omega violations - indicates the number of residues that have Omega angles that are not in the range 180.0 ± 12.5 degrees or 0.0 ± 15.0 degrees. Ideal Omega angles for
trans peptide bonds are 180 degrees while Ideal Omega angles for cis peptide bonds are 0 degrees.

Mean chemical shift correlation - indicates the Pearson correlation coefficient calculated between input chemical shifts and those calculated by SHIFTX for the structure generated by CS23D2.0. Values closer to 0.9 are optimal. Values less than 0.7 indicate possible problems with the structure.

HCount (%) - indicates the percentage of residues in the protein that are involved in canonical hydrogen bonds. Good structures have higher percentages of "good" or canonical hydrogen bonds. Values approaching 80% are ideal. To ensure consistency, hydrogen atoms for all structures were replaced by those from Reduce v3.10 (Word, et al.(1999) "Asparagine and glutamine: using hydrogen atom contacts in the choice of sidechain amide orientation"
J. Mol. Biol. 285, 1733-1745).

Average H-bond energy - indicates the average H-bond energy calculated in kcal/mol (using the DSSP definitions of Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features.
Biopolymers. 1983 Dec;22(12):2577-637). Average H-bond energy indicates how close H-bonds are to ideal geometry. The lower the average H-bond energy, the better the structure is.

Protein Name PDB ID # Rama violations # Omega violations Mean chemical shift correlation HCount (%) Average H-bond energy (kcal/mol)
Barnase (X-ray) 1A2P 0 7 0.83 68% -1.68
Barnase (NMR) 1FW7 1 27 0.77 62% -1.45
Barnase (CS23D2.0) 0 13 0.77 65% -1.62
CI-2 (X-ray) 2CI2 1 0 0.77 71% -1.88
CI-2 (NMR) 3CI2 4 0 0.64 56% -1.75
CI-2 (CS23D2.0) 1 1 0.69 70% -1.50
Ubiquitin (X-ray) 1UBQ 0 0 0.84 83% -1.83
Ubiquitin (NMR) 1XQQ 4 21 0.77 75% -1.82
Ubiquitin (CS23D2.0) 0 4 0.77 79% -1.95
DNAB Helicase (X-ray) 1B79 1 0 0.66 84% -1.19
DNAB Helicase (NMR) 1JWE 0 18 0.67 83% -1.33
DNAB Helicase (CS23D2.0) 0 7 0.67 83% -1.36
Interleukin-4 (X-ray) 1RCB 4 1 0.67 78% -1.18
Interleukin-4 (NMR) 1BCN 3 0 0.54 66% -0.91
Interleukin-4 (CS23D2.0) 1 1 0.54 70% -0.97
BPTI (X-ray) 6PTI 1 0 0.79 65% -1.80
BPTI (NMR) 1OA5 4 0 0.68 57% -1.74
BPTI (CS23D2.0) 1 0 0.70 64% -1.82
Calbindin (X-ray) 1IG5 1 0 0.79 81% -1.24
Calbindin (NMR) 1CLB 4 9 0.74 75% -1.52
Calbindin (CS23D2.0) 0 4 0.74 80% -1.52
Vts1:RNA Dimer (X-ray) 2D3D 0 3 0.77 88% -1.35
Vts1:RNA Dimer (NMR) 2FE9 1 0 0.66 84% -1.19
Vts1:RNA Dimer (CS23D2.0) 1 0 0.67 83% -1.26
Ribosomal Protein L25 (X-ray) 1DFU 1 0 0.76 78% -1.73
Ribosomal Protein L25 (NMR) 1B75 9 5 0.56 61% -1.69
Ribosomal Protein L25 (CS23D2.0) 7 10 0.59 74% -1.73


Table 3. Performance of CS23D2.0 in generating 3D structures for 62 randomly selected BMRB files. The given RMSDs are those of the final structure relative to the known (X-ray or NMR) structure.

BMRB ID

Protein Name

RMSD (Å)

Runtime

(min)

Type

All atoms

C-alpha

BMR2208

Cyclophiliin

0.98

0.39

33

Fragment Assembly +

Homology Modeling

BMR4022

Carbonic Anhydrase

1.58

0.81

48

Fragment Assembly +

Homology Modeling

BMR4031

Ribonuclease A

1.69

0.40

9

Fragment Assembly +

Homology Modeling

BMR4035

Regulatory Protein E2

1.67

0.58

6

Fragment Assembly +

Homology Modeling

BMR4046

Drosophila Heat Shock Factor

3.21

2.02

8

Fragment Assembly +

Homology Modeling

BMR4047

HU Protein

2.24

0.84

7

Fragment Assembly +

Homology Modeling

BMR4053

Staphylococcal Nuclease

1.00

0.22

12

Fragment Assembly +

Homology Modeling

BMR4064

Collagenase

0.97

0.30

13

Fragment Assembly +

Homology Modeling

BMR4070

FimC

2.28

1.73

37

Fragment Assembly +

Homology Modeling

BMR4072

RNA-1 Mod Protein

3.99

3.13

4

Fragment Assembly +

Homology Modeling

BMR4076

Outer Surface Protein

2.21

2.04

19

Fragment Assembly +

Homology Modeling

BMR4077

FK506 Binding Protein

0.93

0.41

7

Fragment Assembly +

Homology Modeling

BMR4081

Interferon Alpha

1.14

0.38

9

Fragment Assembly +

Homology Modeling

BMR4082

Profilin

0.99

0.51

9

Fragment Assembly +

Homology Modeling

BMR4083

CheY

1.18

0.47

8

Fragment Assembly +

Homology Modeling

BMR4084

HnRNP

0.77

0.10

15

Fragment Assembly +

Homology Modeling

BMR4090

Basic FGF

3.34

1.75

6

Fragment Assembly +

Homology Modeling

BMR4091

Basic Fibroblast Growth Factor

1.64

0.41

12

Fragment Assembly +

Homology Modeling

BMR4092

Core Binding Factor

1.84

0.89

9

Fragment Assembly +

Homology Modeling

BMR4094

Interleukin 4

0.64

0.22

10

Fragment Assembly +

Homology Modeling

BMR4098

Fatty Acid Biniding Protein

3.39

1.96

10

Fragment Assembly +

Homology Modeling

BMR4116

Im9 Immunity Protein

0.81

0.23

6

Fragment Assembly +

Homology Modeling

BMR4117

Elongation Factor-1 Beta

2.18

1.35

7

Fragment Assembly +

Homology Modeling

BMR4132

Human UBC9

0.48

0.12

25

Fragment Assembly +

Homology Modeling

BMR4127

DNA-binding Domain of SV40 T Antigen

2.61

1.57

11

Fragment Assembly +

Homology Modeling

BMR4186

CRABPII

1.12

0.55

11

Fragment Assembly +

Homology Modeling

BMR4198

Syntaxin 1A, N-domain

3.48

2.66

9

Fragment Assembly +

Homology Modeling

BMR4285

Apo-S100A1

3.05

1.98

18

Fragment Assembly +

Homology Modeling

BMR4296

Cold Shock protein

1.49

0.57

6

Fragment Assembly +

Homology Modeling

BMR4297

DnaB (24-136)

1.96

0.88

9

Fragment Assembly +

Homology Modeling

BMR4307

Hamster Prion

3.22

1.67

19

Fragment Assembly +

Homology Modeling

BMR4326

DNA polymerase B

2.38

1.74

6

Fragment Assembly +

Homology Modeling

BMR4334

ARID Domain of Dead-Ringer Protein

3.26

2.53

10

Fragment Assembly +

Homology Modeling

BMR4340

Major Urinary Protein

0.75

0.33

14

Fragment Assembly +

Homology Modeling

BMR4354

Maltose Binding Protein

0.99

0.39

49

Fragment Assembly +

Homology Modeling

BMR4380

Epithelial Cadherin. N-domain

2.41

2.10

7

Fragment Assembly +

Homology Modeling

BMR4403

PyJ

2.96

1.96

56

Threading

BMR4405

KH3

1.84

0.96

6

Fragment Assembly +

Homology Modeling

BMR4430

Calcyclin

3.72

2.44

84

Fragment Assembly +

Homology Modeling

BMR4554

Dihydrofolate Reductase

1.13

0.50

12

Threading

BMR4562

Lysozyme

0.84

0.27

9

Fragment Assembly +

Homology Modeling

BMR4577

Ribosomal Protein S4 delta 41

2.35

1.42

22

Fragment Assembly +

Homology Modeling

BMR4675

Foxo4

3.58

2.61

27

Fragment Assembly +

Homology Modeling

BMR4840

Mycobacterium tuberculosis adenylate kinase

2.32

1.38

87

Fragment Assembly +

Homology Modeling

BMR4964

Barnase

0.73

0.31

9

Fragment Assembly +

Homology Modeling

BMR4968

Bovine Pancreatic Trypsin Inhibitor

1.05

0.29

4

Fragment Assembly +

Homology Modeling

BMR4974

CI-2 from Barley Seeds

0.99

0.15

5

Fragment Assembly +

Homology Modeling

BMR5220

Melanoma Inhibitory Activity Protein

2.74

1.27

9

Fragment Assembly +

Homology Modeling

BMR5316

HIV-1 Gag

1.95

1.31

21

Fragment Assembly +

Homology Modeling

BMR5387

Ubiquitin

2.37

1.03

6

Fragment Assembly +

Homology Modeling

BMR547

Calmodulin

3.24

2.81

13

Fragment Assembly +

Homology Modeling

BMR5473

Ovomucoid Third Domain

1.41

0.64

4

Fragment Assembly +

Homology Modeling

BMR5690

Cobra Neurotoxin II

1.60

0.89

5

Fragment Assembly +

Homology Modeling

BMR5899

Spo0f protein

2.58

1.88

9

Fragment Assembly +

Homology Modeling

BMR6338

At1G77540

---

---

---

De Novo (did not converge)

BMR6357

Beta Lactamase

1.25

0.28

25

Fragment Assembly +

Homology Modeling

BMR6367

YhgG

---

---

---

De Novo (did not converge)

BMR6410

Glutaredoxin 1

3.13

2.21

44

Threading

BMR6444

MMP12

1.55

0.64

13

Fragment Assembly +

Homology Modeling

BMR6699

Calbindin D9K

3.02

1.92

5

Fragment Assembly +

Homology Modeling

BMR6821

Superoxide dismutase

1.75

1.28

13

Fragment Assembly +

Homology Modeling

BMR6922

Vts1 SAM domain

2.16

1.16

7

Fragment Assembly +

Homology Modeling

 

 

 

 



Average

 

1.95

1.11

10 (median)




Table 4. Backbone RMSD between CS23D2.0 models and corresponding experimental structures for recent entries in BMRB database (as of the first week of April 2008).

Protein name PDB ID BMRB ID RMSDbb (Å) Method
TnpE 2RN7A 15701 Did not converge De Novo
Chromo domain of chromobox 2K28A 15695 1.34 Fragment Assembly + Homology modeling
SeR13 2K1HA 15678 2.31 Fragment Assembly + Homology modeling
Chromo domain of the Chromobox 2K1BA 15674 1.22 Fragment Assembly + Homology modeling
EwR120 2JZAA 15611 1.72 Fragment Assembly + Homology modeling
SgR42 2JZ2A 15604 4.88 Threading
ER541-37-162 2K1G 15603 Did not converge Threading
alpha-RgIA 1CNL 15586 1.69 Fragment Assembly + Homology modeling
Ig domain of human palladin 2DM2A 15512 1.29 Threading
Human NEMO zinc finger 2JT3A 15499 1.81 Fragment Assembly + Homology modeling
NEMO zinc finger 2JVXA 15499 2.85 Fragment Assembly + Homology modeling
Abl kinase 2HYYA 15488 1.76 Fragment Assembly + Homology modeling
YmoA 2JVPA 15486 2.15 Fragment Assembly + Homology modeling
alpha-RgIA 1CNL 15436 1.69 Fragment Assembly + Homology modeling
alpha-RgIA 1CNL 15435 1.69 Fragment Assembly + Homology modeling
apo-LFABP 2JU3A 15429 2.32 Fragment Assembly + Homology modeling
Human Ubiquitin 1UBQ 15410 1.05 Fragment Assembly + Homology modeling
RGD-hirudin 2JOOA 15405 1.32 Fragment Assembly + Homology modeling
MMP-3 2JT6 15396 1.21 Fragment Assembly + Homology modeling
MMP-3 2JT5A 15395 1.17 Fragment Assembly + Homology modeling
F153W cardiac troponin C 2JT3A 15388 3.68 Fragment Assembly + Homology modeling
Four-Alpha-Helix Bundle 2I7UA 15384 1.48 Threading
Huwentoxin-XI 2JOTA 15382 1.51 Fragment Assembly + Homology modeling
VraR DNA Binding Domain 2RNJA 15378 0.93 Fragment Assembly + Homology modeling
Human DESR1 2JR7A 15377 1.81 Fragment Assembly + Homology modelling
Cys2His2 zinc finger 2JSPA 15373 Did not converge De Novo
alpha-RgIA 1CNL 15368 1.69 Fragment Assembly + Homology modeling
alpha-RgIA 1CNL 15367 1.69 Fragment Assembly + Homology modeling
MMP20 2JSDA 15361 1.14 Fragment Assembly + Homology modeling
MEKK3 PB1 domain 2PPHA 15355 1.24 Fragment Assembly + Homology modeling
1st SH3 domain of adaptor Nck 2JS2A 15351 1.57 Fragment Assembly + Homology modeling
2nd SH3 domain of adaptor Nck 2JS0A 15349 2.05 Fragment Assembly + Homology modeling
Manduca sexta moricin 2JR8A 15323 2.66 Fragment Assembly + Homology modeling
CheW 2HO9A 15322 2.66 Threading
Discoidin Domain of DDR2 2Z4FA 15315 1.63 Threading
NusB from Aquifex Aeolicus 2JR0 15312 3.91 Threading
C2 domain of human Smurf2 2JQZA 15306 Did not converge Threading
CTX A3 1I02A 15305 2.20 Fragment Assembly + Homology modeling
Cu(I) human Sco2 2RLIA 11001 2.19 Fragment Assembly + Homology modeling


Table 5. Backbone RMSDs between models generated by CS23D2.0 sub-program for ab initio folding and corresponding experimental structures.

Protein PDB ID RMSDbb (Å)
HIS15ASP 1CM2 0.98
TM1442 1SBO 2.56
XcR50 1TTZ 1.31
CSPA 1MJC 2.77
Ubiquitin 1UBQ 2.5
Apo-lfabp 1LFO 1.68
DNA Polymerase 1DK3 2.71
OMTKY3 1OMT 1.83
BPTI 5PTI 1.94
Calbindin D9k 1IG5 3.1
Rosetta 2.2.0 was used to provide CS23D2.0 with ab initio folding functionality

Table 6: Ubiquitin models built with CS23D2.0 using (a) homology modelling, (b) threading, and (c) ab initio folding. RMSDs and sequence identity values were calculated with respect to the native structure (1UBQ). Tables (a) and (b) show the ubiquitin models built as ubiquitin homologs were progressively removed from the database, so that the homolog used in a given row is the best one found by CS23D2.0 when the more closely matching homologs above it were removed. (c) Results of ab initio folding of ubiquitin by Rosetta. 50 models were produced in each of 10 runs. Backbone RMSDs of the model with the best CS23D2.0 score are shown, with those less than 2.8 Å labeled as successes. Homologous proteins were excluded from Rosetta's fragment generation step.


(a) Homology Modelling

Template RMSD (Å) Sequence Identity (%)
All atoms C-alpha
1UBQA 0.00 0.00 100.00
1UD7A 2.18 0.85 90.79
1C3TA 2.09 0.60 89.47
1SIFA 1.35 0.41 90.14
1BT0A 1.55 0.52 61.64
1NDDA 1.43 0.49 56.76
2DZIA 2.43 0.93 39.47
2JWZA 2.01 1.00 94.74
2BKRB 1.96 0.50 57.89
2NVUJ 2.03 1.07 57.89

(b) Threading

Template RMSD (Å) Sequence Identity (%)
All atoms C-alpha
1UELA 1.99 0.79 31.58
1MG8A 2.43 1.41 31.58
1IYFA 3.17 1.99 30.26
1WY8A 4.47 4.16 39.47
1WX9A 2.52 1.16 34.21
2BWFA 4.36 3.98 32.89
1OQYA 3.68 2.49 35.53
2HJ8A 2.73 1.84 32.00
1WE7A 4.85 4.16 27.63
2FAZA 5.20 4.83 36.84
1WH3A 2.14 0.86 35.53
1Z2MA 2.30 0.96 32.89
1TTNA 2.76 1.60 26.32
1M94A 2.12 0.97 23.29

(c) Ab initio

Run Lowest Score Backbone RMSD (Å) Results
1 2.80 FAILED
2 4.39 FAILED
3 3.36 FAILED
4 2.92 FAILED
5 5.26 FAILED
6 3.99 FAILED
7 2.72 SUCCESS
8 2.55 SUCCESS
9 2.82 FAILED
10 2.78 SUCCESS




Table 7. Comparison of model global accuracy achieved by CS23D2.0 and similar programs.



(a). Comparison of backbone RMSDs of models generated by CS-ROSETTA, CHESHIRE, and CS23D2.0.

Protein Name PDB ID BMRB ID CS-ROSETTA RMSDbb (Å) CHESHIRE RMSDbb (Å) CS23D2.0 RMSDbb (Å)
TM1442 1SBO 5921 1.22 1.32 1.84
Calbindin 4ICB 390 1.39 1.47 2.41
Hpr 1HDN 2371 1.99 1.83 0.96
Ubiquitin 1UBQA 5387 0.82 1.33 1.34
FF domain 2UCZ 6711 1.46 1.12
HR2106 2HZ5 6210 1.85 2.13
Spo0F 1SRR 5899 1.67 1.88
Apo 1fabp 1LFO 4098 1.72 2.11
CspA 1MJC 4296 1.43 0.75
XcR50 1TTZ 6363 1.67 2.41
Profilin 1PFL 4082 2.26 1.8
GB3 2OED 15283 0.74 0.98


(b) Comparison of backbone RMSDs of models generated by CS-ROSETTA and CS23D2.0 for proteins that did not meet CS-ROSETTA convergence criteria.

Protein Name PDB ID BMRB ID CS-Rosetta RMSDbb (Å) CS23D2.0 RMSDbb (Å)
HI0719 1J7H 5606 14.31 1.02
MTH1598 1JW3 5165 12.17 Did not converge
HR1958 1TVG 6344 16.29 Did not converge
CcR19 1T17 6120 7.09 4.17
YwIE 1ZGG 6460 9.37 1.37
Flua 1N0S 5756 15.57 1.45
nsp1 2GDT 7014 5.6 Did not converge


(c) Comparison of backbone RMSDs of models generated by CS-ROSETTA and CS23D2.0 for structural genomics proteins.

Protein Name PDB ID BMRB ID CS-Rosetta RMSDbb (Å) CS23D2.0 RMSDbb (Å)
RpT7 2JTV 15419 0.64 3.79
StR82 2JT1 15386 0.57 3.45
RhR95 2JVM 15341 0.66 2.77
TR80 2JXT 15573 1.27 Did not converge
VfR117 2JVW 15491 0.6 6.79
PsR211 2JVA 15471 2.07 3.49



Figure 1. Dependence of backbone RMSD of CS23D2.0 models on template sequence identity for different methods of structure generation

A scatter plot showing the performance of CS23D2.0´s three approaches to structure generation (subfragment assembly + homology modeling, chemical shift threading and ab initio structure generation) relative to the level of sequence identity of the matching templates or subfragments. Data from Tables 3-6 in the CS23D2.0 web documentation pages was used to assemble this graph.

Header image

Energy Function

Summary

CS23D2.0 uses an energy function consisting of several statistics-based pseudo-potentials:

  • Chemical shift correlation scores for HA, HN, N, CA, CB, and CO chemical shifts. Chemical shifts for the CS23D2.0 models are predicted using SHIFTX, and, after subtraction of random coil shifts for CA and CB shifts, correlation coeffecients are determined between these predicted chemical shifts and the experimental chemical shifts submitted by the user.
  • A threading potential.
  • Bump score for backbone atoms plus C-Beta and C-Gamma atoms.
  • Hydrogen bond count.
  • Ramachandran score counting phi and psi torsion angle violations.
  • Omega score counting omega angle violations.
  • Chi score based on expected chi angles for different phi and psi combinations.
  • Radius of gyration, giving a penalty if the calculated radius of gyration deviates from the expected value by more than 10%.
  • Hydrogen bond energy.
  • Disulphide bond count.
  • Secondary structure score, based on the similarity between the secondary structure of the CS23D2.0 3-D model and the expected secondary structure. The expected secondary structure is normally based on that predicted by a sequence homology match to the PPT-DB 2º Structure (Cytoplasmic) database. If, however, this predicted secondary structure is more than 50% different from the secondary structure predicted from the chemical shifts by CSI, the CSI-predicted secondary structure is used as the expected secondary structure.
  • Score Weighting

    Each of the initial terms in the energy function are transformed through a series of steps: (1) They are multiplied so that they behave as energy-like terms, i.e. more negative scores indicate a better structure. (2) They are scaled so that they all have close to the same order of magnitude (about ± 1 to 100). (3) All scores except for the chemical shifts scores and the radius of gyration score are normalized by dividing by the number of residues in the peptide, and then multiplying by 100 to re-scale them. (4) All scores are multiplied by weights that were determined based on training of the energy function on CASP7 decoys and decoys generated by Rosetta.

    The initial score calculations, plus the results of steps (1) and (2) are:

  • Each of the Chemical shift correlation values (0.0 to 1.0) is multiplied by -10.0.
  • The threading potential is multiplied by 0.2.
  • In the initial bump score, for all pairs of atoms that are closer than the allowed threshold distances, the magnitude of the difference between the observed distance and the threshold, in Angstroms, is added to the score. Any distances closer than 70% of the threshold receive an additional penalty of +10. The initial bump score is multiplied by 0.5.
  • The hydrogen bond count (the number of hydrogen bonds) is multiplied by -1.1.
  • The initial Ramachandran score is calculated based on Ramachandran tables with four angle classes: 3 (good), 2 (acceptable), 1 (minimal), 0 (disallowed). For each residue, this class number r is used to generate a Rama score value of (3 - r)², and the sum of all these values over all residues is calculated. To the final score is then added the fraction of residues that did not have an r value of 3.
  • For the omega score, if an omega angle is not within ±12.5° of 180° or within ±15° of 0°, then the magnitude of the difference between the omega angle and these thresholds is added to the omega score.
  • The chi score is based on the fact that chi angles cluster into three main groups for each residue. Each chi angle is assigned to the chi group to which it is closest, and the probability of observing that chi group given the phi and psi angles of the residue is subtracted from the chi score.
  • For the radius of gyration score, if the calculated radius of gyration is not within 10% of the expected radius of gyration, the magnitude of the difference between the real and expected values is used as the intial radius of gyration score. The initial score is multiplied by 2.5.
  • For the hydrogen bond energy score, the sum of all the hydrogen bond energies is calculated. The sum is divided by the number of residues and multiplied by 40.0.
  • The disulphide bond count is multiplied by -3.0.
  • For the secondary structure score, the sum of all secondary structure mismatch scores in the peptide is calculated: +4 for Beta-strand or helix matched with coil, and +8 for Beta-strand residues matched with helix residues.

  • The weightings for step (4) for the above scores, other than the chemical shift scores, were trained on decoys from CASP7 using linear regression-based fitting. The weights for the chemical shift scores were trained separately on Rosetta decoys of structures with known chemical shifts; both ab-initio Rosetta decoys and Rosetta decoys generated from relaxing the native structures. The relative weighting between the chemical shift scores and the other scores was then determined based on linear regression. The weights in step (4) are multiplied by the values in step (3) to obtain the final scores. The step (4) weights are:

    Score name Score weight
    HA 1.22
    HN 0.99
    N 0.71
    CA 0.62
    CB 1.00
    CO 0.41
    Thread 0.46
    Bump 0.07
    Hcount 0.05
    Rama 0.05
    Omega † 0.02
    Chi † 0.02
    RadGyr 0.57
    HBenergy 0.15
    Disulph 0.55
    SecStr 0.23

    † Adjusted manually to a low value because training set it to 0.

    CS23D2.0 Results Assessment

    A CS23D2.0 results summary for a structure prediction includes the mean chemical shift correlation values before and after energy minimization. The reliability of the final structure is guaged by looking at the final mean chemical shift correlation:
       0.75 - 1.00 = High
       0.65 - 0.75 = Good
       0.55 - 0.65 = Moderate
       0.00 - 0.55 = Poor
    
    A CS23D2.0 results summary also prints information on torsion angle violations. The expected number of torsion angles in the core, allowed, generous and disallowed regions is estimated based on the following percentages:
       #res in phi/psi core          90%
       #res in phi/psi allowed        7%
       #res in phi/psi generous       1%
       #res in phi/psi disallowed     0%
       #res in omega allowed         99%
       #res in omega disallowed       1%
    
    Note: The first and last residues are not use for phi/psi, and the last residue is not used for omega.

    Torsion Angle Mapping for Thrifty

    CS23D2.0 performs a homolog search based torsion similarity against a non-redundant database of all cytoplasmic proteins in the PDB. This search uses an updated version of Thrifty that assigns each residue a letter based on its torsion angle class. The torsion angle mappings for each of the 9 letters in this torsion angle alphabet are shown below.

    Torsion Angle Alphabet