|
Present dataset |
AG dataset |
|
|||||||||||||||||
No. |
Protein |
PDB |
Class |
Fold |
Lpdb |
L |
pH |
Temp (°C) |
Folding type |
ln(kf), reported |
ln(kf), temperature-corrected |
ln(kI) |
ln(ku), reported |
ln(ku), temperature-corrected |
βT |
pH |
Temp (°C) |
Folding type |
ln(kf) |
Comments |
1 |
Apomyoglobin (Whale) [1] |
1A6N |
α |
Globin-like |
151 |
153 |
6.2 |
5 |
N2S |
1.1 |
4.5 |
NA |
-3.8 |
5.5 |
0.72 |
— |
— |
— |
— |
|
2 |
Pit1 [2] |
1AU7 (103–160) |
α |
DNA/RNA-binding 3-helical bundle |
58 |
63 |
5.5 |
25.0 |
N2S |
9.7 |
12.6 |
5.5 |
0.74 |
— |
— |
— |
— |
|||
3 |
4-helix bundle protein FRB [3] |
1AUE (Chain B: 2022–2115) |
α |
Four-helical up-and-down bundle |
94 |
95 |
7.5 |
10 |
N2S |
5.4 |
6.7 |
NA |
-5.2 |
-1.0 |
0.83 |
— |
— |
— |
— |
Both the AG and our datasets adopted the same reference [3]. The ACPro and our datasets reported the same ln(kf) value, but the Garbuzynskiy dataset reported a different value. |
4 |
IM7 [4] |
1AYI (1–86) |
α |
HIV-1 gp41 fragments |
86 |
86 |
7.0 |
10 |
N2S |
5.7 |
6.9 |
8.0 |
-0.84 |
3.0 |
0.90 |
— |
25 |
2S |
7.2 |
The ln(kf) value reported in the AG dataset was based on the 2S model [5]. However, the N2S nature of this protein is well established [4], so our reported value is based on the N2S model. |
5 |
Apomyoglobin (Horse) [6] |
1DWR (1–152) |
α |
Globin-like |
152 |
153 |
6.0 |
26 |
N2S |
2.9 |
2.9 |
5.3 |
NA |
NA |
NA |
NA |
NA |
NA |
The ln(kI) value was taken from the rate constant of the fast phase of the reported biphasic refolding kinetics. |
|
6 |
Engrailed Homeodomain [7] |
1ENH |
α |
DNA/RNA-binding 3-helical bundle |
54 |
54 |
5.7 |
25.0 |
N2S |
10.6 |
NA |
7.6 |
0.83 |
— |
— |
— |
— |
|||
7 |
FF domain from human HYPA/FBP11 [8] |
1UZC (3–71) |
α |
3-Helical bundle |
69 |
71 |
5.7 |
25 |
N2S |
8.0 |
9.9 |
3.4 |
0.91 |
— |
— |
— |
— |
The same experimental group reported the ln(kf) values at 25°C [8] and at 10°C [9]. The ACPro and our datasets reported the value at 25°C, but the Garbuzynskiy dataset reported the value at 10°C. |
||
8 |
ACBP (Bovine) [10] |
1NTI |
α |
Acyl-CoA binding protein-like |
86 |
86 |
5.3 |
26 |
N2S |
6.5 |
6.4 |
9.3 |
-2.7 |
-2.9 |
0.70 |
— |
25 |
2S |
6.96 |
The ln(kf) value reported in the AG dataset was based on the 2S model [5]. However, the N2S nature of this protein is well established [10], so our reported value is based on the N2S model. |
9 |
Phage 434 Cro [11] |
2CRO (1–65) |
α |
Lambda repressor-like DNA-binding domains |
65 |
71 |
6.0 |
20 |
N2S |
3.7 |
4.0 |
NA |
-0.39 |
0.54 |
0.91 |
— |
— |
— |
5.35 |
Both the AG and our datasets adopted the same reference [11]. The Garbuzynskiy and our datasets reported the same ln(kf) value, but the ACPro dataset reported a different value. |
10 |
X domain of Measles Virus protein [12] |
1OKS |
α |
Immunoglobulin/albumin-binding domain-like |
49 |
49 |
7.2 |
25 |
N2S |
6.2 |
8.6 |
4.5 |
0.84 |
NA |
NA |
NA |
NA |
|||
11 |
Barstar [13] |
1BTA |
α/β |
Barstar-like |
89 |
90 |
7 |
25 |
N2S |
3.5 |
NA |
-2.3 |
NA |
8 |
— |
— |
3.47 |
We have adopted the data from reference [13], which is more recent than reference [14] adopted by the AG dataset. |
||
12 |
Apoflavodoxin (Anabaena) [15] |
1FTG (2–169) |
α/β |
Flavodoxin-like |
168 |
169 |
7.0 |
25.0 |
N2S |
2.3 |
3.4 |
-3.0 |
0.55 |
— |
— |
2S |
— |
The ln(kf) value reported in the AG dataset was based on the 2S model [16]. However, the N2S nature of this protein is well established [15], so our reported value is based on the N2S model. Moreover, the intermediate (I) is mostly off-pathway. |
||
13 |
HIV-1 RNase H [17] |
1HRH (427–556) |
α/β |
Ribonuclease H-like motif |
130 |
134 |
5.5 |
25 |
N2S |
0.88 |
NA |
-2.5 |
0.64 |
NA |
NA |
NA |
NA |
|||
14 |
N-PGK (Bacillus stearothermophilus) [18] |
1PHP (1–175) |
α/β |
Phosphoglycerate kinase |
175 |
175 |
7.5 |
25 |
N2S |
2.3 |
NA |
-3.5 |
0.84 |
— |
— |
— |
— |
|||
15 |
C-PGK (Bacillus stearothermophilus) [19] |
1PHP (186–394) |
α/β |
Phosphoglycerate kinase |
209 |
209 |
7.2 |
25 |
N2S |
-4.0 |
NA |
-9.3 |
0.51 |
— |
— |
— |
-3.44 |
The Garbuzynskiy and our datasets have adopted the data from reference [19], which is more recent than reference [18] adopted by the ACPro dataset. |
||
16 |
DHFR [20] |
1RA9 |
α/β |
Dihydrofolate reductase-like |
159 |
159 |
7.8 |
15 |
N2S |
-0.37 |
0.86 |
1.5 |
-5.2 |
-0.29 |
0.92 |
— |
— |
— |
NA |
We have adopted the data from reference [20], which is more recent than reference [21] adopted by the AG dataset. |
17 |
Trp-synthase α-subunit (Escherichia coli) [22] |
1WQ5 |
α/β |
TIM β/α-barrel |
268 |
268 |
7 |
25 |
N2S |
-2.1 |
NA |
-8.9 |
NA |
— |
— |
— |
— |
|||
18 |
RNase H (Escherichia coli) [23] |
2RN2 |
α/β |
Ribonuclease H-like motif |
155 |
155 |
5.5 |
25 |
N2S |
-0.3 |
NA |
-11.4 |
0.80 |
— |
— |
— |
0.0095 |
We have adopted the data from reference [23] rather than reference [24], which the AG datasets adopted, because the ln(kf) value reported in reference [24] was measured in D2O rather than in H2O. |
||
19 |
CheY [25] |
3CHY |
α/β |
Flavodoxin-like |
128 |
129 |
7.0 |
25 |
N2S |
1.0 |
NA |
-4.4 |
0.70 |
— |
— |
— |
— |
|||
20 |
Apoflavodoxin (Desulfovibrio desulfuricans) [26] |
3F6R (2–148) |
α/β |
Flavodoxin-like |
147 |
148 |
7 |
20 |
N2S |
3.5 |
4.0 |
NA |
-5.8 |
-3.6 |
0.88 |
— |
— |
— |
— |
|
21 |
sIGPS (Sulfolobus solfataricus) [27] |
1IGS (27–248) |
α/β |
TIM β/α-barrel |
222 |
222 |
7.8 |
25 |
N2S |
-4.5 |
NA |
-13.9 |
0.79 |
NA |
NA |
NA |
NA |
|||
22 |
RNase H (Chlorobaculum tepidum) [28] |
3H08 |
α/β |
Ribonuclease H-like motif |
146 |
146 |
5.5 |
25 |
N2S |
1.6 |
NA |
-13.6 |
NA |
NA |
NA |
NA |
NA |
|||
23 |
HisF [29] |
1THF |
α/β |
TIM β/α-barrel |
253 |
253 |
7.5 |
25 |
N2S |
-3.2 |
-1.4 |
-29.7 |
NA |
NA |
NA |
NA |
NA |
|||
24 |
GFP [30] |
1B9C (4–230) |
α+β |
GFP-like |
227 |
238 |
7.5 |
25.0 |
N2S |
-2.6 |
|
0.78 |
-23.5 |
|
NA |
— |
— |
— |
-1.59 |
Although both the ACPro and our datasets adopted the data from the same reference [30], the reported ln(kf) value differs between the two datasets. Because this protein exhibited multiple parallel pathways of folding, we reported the averaged kf value obtained by the equation kf = Σ_i f_i k_i, where f_i and k_i are the fractional amplitude and the observed rate constant, respectively, of the i-th pathway of folding. The ACPro dataset, in contrast, reported the value for the major pathway of folding. The ln(ku) value was taken from reference [31]. |
25 |
Barnase [32] |
1BNI (3–110) |
α+β |
Microbial ribonucleases |
108 |
110 |
7.5 |
25 |
N2S |
2.7 |
NA |
-12.2 |
0.64 |
6.3 |
— |
— |
2.56 |
We have adopted the data from reference [32], which is more recent than reference [33] adopted by the AG dataset. |
||
26 |
p16INK4a [34] |
2A5E (9–156) |
α+β |
β-Hairpin-α-hairpin repeat |
148 |
148 |
7.5 |
25 |
N2S |
3.5 |
NA |
-0.22 |
0.89 |
— |
— |
— |
— |
|||
27 |
N-HypF [35] |
1GXT (4–91) |
α+β |
Ferredoxin-like |
88 |
91 |
5.5 |
28 |
N2S |
4.4 |
4.3 |
NA |
-3.9 |
-4.7 |
NA |
— |
— |
— |
— |
|
28 |
Monellin [36] |
1FA3 |
α+β |
β-β-α Zinc fingers |
96 |
97 |
7 |
25 |
N2S |
4.1 |
NA |
-10.8 |
0.89 |
NA |
NA |
NA |
NA |
Because this protein exhibited multiple parallel pathways of folding, we reported the averaged kf value obtained by the equation shown in the comment of entry number 24. |
||
29 |
T4 Lysozyme [24] |
1L63 (1–162) |
α+β |
Lysozyme-like |
162 |
164 |
6.0 |
25.0 |
N2S |
3.7 |
NA |
-12.3 |
NA |
— |
— |
— |
4.1 |
Although the ACPro and our datasets adopted the data from the same reference [24], the reported ln(kf) value differs between the two datasets. The ACPro dataset reported the value based on the 2S model, whereas our value is based on the N2S model. |
||
30 |
B1 domain of protein G (Streptococcal sp. group G) [37] |
1PGB |
α+β |
Ubiquitin-like |
56 |
57 |
5 |
20 |
N2S |
6.4 |
6.6 |
7.7 |
-2.0 |
-1.2 |
0.85 |
7.5 |
25 |
2S |
6.3 |
The ln(kf) value reported in the AG dataset was based on the 2S model [5]. However, the N2S nature of this protein is well established [37], so our reported value is based on the N2S model. The value was measured in the presence of 0.4 M sodium sulfate [37]. |
31 |
p13suc1 [38] |
1PUC_mod (2–102) |
α+β |
Cell cycle regulatory proteins |
101 |
114 |
7.5 |
25 |
N2S |
4.2 |
NA |
-4.0 |
0.80 |
— |
— |
— |
— |
1PUC_mod indicates the monomeric form of p13suc1; its coordinates were kindly provided by J. Schymkowitz [39]. |
||
32 |
Ubiquitin [40] |
1UBQ |
α+β |
Ubiquitin-like |
76 |
76 |
5.0 |
25 |
N2S |
5.3 |
7.3 |
-6.7 |
0.65 |
— |
— |
2S |
7.33 |
The ln(kf) value reported in the AG dataset was based on the 2S model [5]. However, the N2S nature of this protein is well established [40], so our reported value is based on the N2S model. The ln(ku) value was taken from reference [41]. |
||
33 |
Villin 14T [42] |
2VIL |
α+β |
Gelsolin-like |
126 |
126 |
5 |
37 |
N2S |
4.2 |
4.0 |
NA |
-2.4 |
-6.9 |
0.78 |
4.1 |
25 |
— |
11.9 |
We have adopted the data from reference [42], which is more recent than reference [43] adopted by the AG dataset. |
34 |
β-Lactamase (Staphylococcus aureus) [44] |
3BLM |
α+β |
β-Lactamase/transpeptidase-like |
257 |
257 |
7.0 |
25 |
N2S |
-6.6 |
NA |
-10.8 |
0.79 |
NA |
NA |
NA |
NA |
|||
35 |
ACYP (Sulfolobus solfataricus) [45] |
2BJD (12–101) |
α+β |
Ferredoxin-like |
90 |
101 |
5.5 |
37 |
N2S |
1.7 |
1.6 |
NA |
-12.0 |
-15.2 |
0.66 |
NA |
NA |
NA |
NA |
|
36 |
UCH-L3 [46] |
1UCH (5–230) |
α+β |
Cysteine proteinases |
226 |
230 |
7.6 |
25 |
N2S |
-2.6 |
NA |
-6.9 |
0.72 |
NA |
NA |
NA |
NA |
|||
37 |
Ubq-UIM [47] |
2KDI |
α+β |
NA |
114 |
114 |
7.4 |
25 |
N2S |
2.3 |
NA |
-9.9 |
NA |
NA |
NA |
NA |
NA |
|||
38 |
Frataxin (Human) [48] |
1EKG |
α+β |
N domain of copper amine oxidase-like |
119 |
130 |
7.0 |
25 |
N2S |
3.5 |
NA |
-8.9 |
0.72 |
NA |
NA |
NA |
NA |
|||
39 |
β-Lactamase (Bacillus licheniformis) [49] |
4BLM (31–291) |
α+β |
β-Lactamase/transpeptidase-like |
261 |
265 |
7.0 |
20 |
N2S |
-4.7 |
-3.9 |
NA |
-13.6 |
-9.6 |
0.73 |
— |
— |
— |
-1.24 |
Although the ACPro and our datasets adopted the data from the same reference [49], the reported ln(kf) value differs between the two datasets. The ACPro dataset reported the value based on the 2S analysis using the kinetic folding data between 0.6 and 2 M guanidinium chloride, whereas our reported value is based on the N2S analysis using the data below 0.4 M guanidinium chloride. |
40 |
CD2.d1 [50] |
1HNG (2–98) |
β |
Immunoglobulin-like β-sandwich |
97 |
98 |
7.0 |
25 |
N2S |
1.8 |
NA |
-5.7 |
NA |
— |
— |
2S |
— |
Both the AG and our datasets adopted the same reference [50]. The Garbuzynskiy and our datasets classified the folding type as the N2S type according to the reference, but the ACPro dataset classified it as the 2S type. |
||
41 |
IL-1β [51] |
1I1B (3–153) |
β |
β-Trefoil |
151 |
153 |
7.0 |
25 |
N2S |
-4.0 |
1.4 |
-11.1 |
0.93 |
— |
NA |
— |
— |
The ln(ku) was taken from reference [52]. |
||
42 |
TI I27 [53] |
1TIT |
β |
Immunoglobulin-like β-sandwich |
89 |
89 |
7.4 |
25 |
N2S |
3.6 |
NA |
-7.6 |
0.94 |
— |
— |
— |
— |
Both the AG and our datasets adopted the same reference [53]. The ACPro and our datasets classified the folding type as the N2S type according to the reference, but the Garbuzynskiy dataset classified it as the 2S type. |
||
43 |
10FNIII [54] |
1TTG |
β |
Immunoglobulin-like β-sandwich |
94 |
94 |
5.0 |
25 |
N2S |
5.5 |
NA |
-8.4 |
NA |
— |
— |
— |
— |
The ln(ku) value was obtained by extrapolation to zero denaturant using a second-order polynomial in the denaturant (GuSCN) activity. The ln(kf) value was based on linear extrapolation against GuHCl concentration [55]. |
||
44 |
CRABPI (Mouse) [56] |
1CBI |
β |
Lipocalins |
136 |
137 |
8.0 |
25 |
N2S |
-3.2 |
2.6 |
-9.2 |
0.72 |
NA |
NA |
NA |
NA |
|||
45 |
CRBPII (Rat) [56] |
1OPA (1–133) |
β |
Lipocalins |
133 |
134 |
8.0 |
25 |
N2S |
1.4 |
7.6 |
-6.3 |
0.79 |
NA |
NA |
NA |
NA |
|||
46 |
IFABP [57] |
1IFC |
β |
Lipocalins |
131 |
131 |
7.3 |
20 |
N2S |
4.3 |
4.7 |
7.6 |
-5.1 |
-3.1 |
0.69 |
NA |
NA |
— |
NA |
We have adopted the data from reference [57], which is more recent than reference [56] adopted by the Garbuzynskiy dataset. |
47 |
Carbonic anhydrase II (Bovine) [58] |
1V9E |
β |
Carbonic anhydrase |
259 |
260 |
8.0 |
20 |
N2S |
-4.4 |
-3.6 |
-2.4 |
-23.6 |
-19.6 |
NA |
— |
— |
— |
— |
|
48 |
CRABPII (Mouse) [59] |
2FS6 |
β |
Lipocalins |
137 |
137 |
8.0 |
25 |
N2S |
2.3 |
4.7 |
-5.7 |
0.83 |
NA |
NA |
NA |
NA |
|||
49 |
Pseudoazurin [60] |
1ADW |
β |
Cupredoxin-like |
123 |
123 |
7.0 |
15 |
N2S |
0.69 |
1.6 |
NA |
-3.5 |
0.2 |
0.90 |
— |
— |
— |
— |
The experiments were carried out in the presence of 0.5 M Na2SO4. |
50 |
SNase [61] |
2PQE |
β |
OB-fold |
149 |
149 |
6.0 |
20.0 |
N2S |
2.2 |
2.7 |
4.9 |
-6.3 |
-4.0 |
NA |
5.3 |
15 |
— |
2.34 |
We adopted reference [61], but the AG dataset adopted another reference [62]. Because this protein exhibited multiple parallel pathways of folding, we reported the averaged kf value obtained by the equation shown in the comment of entry number 24. Moreover, the ln(ku) value was estimated from the chevron plot for the P47T/P117G mutant (pH 7.0 and 20°C) [63]. |
51 |
BABP (Human) [64] |
5L8I (3–127) |
β |
Lipocalins |
125 |
128 |
8.0 |
25 |
N2S |
0.64 |
2.7 |
-8.6 |
0.70 |
NA |
NA |
NA |
NA |
|||
52 |
IL-33 [65] |
2KLL |
β |
NA |
160 |
160 |
6.5 |
25 |
N2S |
-1.4 |
NA |
-12.3 |
0.82 |
NA |
NA |
NA |
NA |
The description of each protein includes the following columns.
Column 1: “No”: serial number.
Column 2: “Protein short name”: includes a reference to the original experimental paper on its folding kinetics.
Column 3: “PDB code”: the code of the protein three-dimensional (3D) structure in the PDB. If only a part of the chain was used in the protein folding experiment, that part is given in brackets. If a 3D structure is composed of multiple separate chains (e.g. A, B, C, and D), we considered the chain that has the highest coverage of L. If the considered chain is other than chain A, it is explicitly mentioned.
Column 4: “Protein structural class”.
Column 5: Fold classification by SCOP (http://scop.mrc-lmb.cam.ac.uk/scop/).
Column 6: “Lpdb”: number of folded residues according to the PDB data. If a 3D structure is not continuous (i.e. has multiple breaks), we removed only the terminal residues (N- and C-terminal) and considered the remaining region as continuous.
Column 7: “L”: number of residues in the protein used in the experimental study.
Column 8: The present dataset (also referred to as our dataset), which contains nine subdivisions: 1) pH, 2) temperature, 3) folding type, 4) the ln(kf) value reported, 5) the ln(kf) value after the temperature correction, 6) the logarithmic rate constant of formation of a folding intermediate, ln(kI), when the value is available in the literature (only for N2S proteins), 7) the ln(ku) value reported, 8) the ln(ku) value after the temperature correction, and 9) the Tanford β value (βT).
Column 9: The ACPro and Garbuzynskiy (AG) dataset, which contains four subdivisions: 1) pH, 2) temperature, 3) folding type, and 4) ln(kf). If the values provided in the present dataset and the AG dataset are the same, they are represented as “—”. Otherwise, the values reported in the AG dataset are shown as is. If the entry in the present dataset is unique or had not been previously reported, it is represented as “NA”. Note: if the ACPro and Garbuzynskiy values were identical, we reported them in a single row. Otherwise, the ACPro and Garbuzynskiy values are listed in the first and second rows, respectively.
Column 10: Comments, including descriptions of discrepancies between the present dataset and the AG dataset for specific proteins, where necessary. |
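The comment for entry 24 defines the averaged folding rate for proteins with parallel folding pathways as kf = Σ_i f_i k_i. The following is a minimal sketch of that calculation, not the authors' own code. The function name `averaged_ln_kf` and the example amplitudes and rate constants are hypothetical and are not taken from the table. Renormalizing the amplitudes so they sum to 1 is an assumption made here for robustness.

```python
import math


def averaged_ln_kf(amplitudes, rate_constants):
    """Return ln(kf) for kf = sum_i f_i * k_i over parallel folding pathways.

    amplitudes: kinetic amplitudes of each phase (assumed non-negative).
    rate_constants: observed rate constants k_i, one per phase.
    The amplitudes are renormalized to fractions f_i that sum to 1
    (an assumption of this sketch, not a statement from the source).
    """
    if len(amplitudes) != len(rate_constants):
        raise ValueError("need one rate constant per amplitude")
    total = sum(amplitudes)
    if total <= 0:
        raise ValueError("amplitudes must sum to a positive value")
    fractions = [a / total for a in amplitudes]
    kf = sum(f * k for f, k in zip(fractions, rate_constants))
    if kf <= 0:
        raise ValueError("averaged kf must be positive to take its logarithm")
    return math.log(kf)


# Hypothetical two-pathway example: equal amplitudes, k = 2.0 and 4.0,
# so kf = 0.5 * 2.0 + 0.5 * 4.0 = 3.0 and the result is ln(3.0).
example = averaged_ln_kf([0.5, 0.5], [2.0, 4.0])
```

Note that the average is taken over the rate constants before the logarithm is applied, so the result generally differs from an amplitude-weighted mean of the individual ln(k_i) values.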
References:
|