Loop Test Sets

We obtained a non-homologous set of protein structures from the PDB (< 20% sequence identity), determined by X-ray crystallography at a resolution of 1.8 Å or better, from http://www.fccc.edu/research/labs/dunbrack/pisces/culledpdb.html. Secondary structural elements were identified using the DSSP program (Kabsch and Sander, 1983). Segments connecting two secondary structural elements were defined as loops. For each loop length from 1 to 15 residues, a test set containing 50 loops of that length was extracted.
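The loop definition above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the test sets: it assumes the DSSP assignment is available as one string with one character per residue, and it assumes that helices (H, G, I) and strands (E, B) count as secondary structural elements, which the text does not specify. The function name find_loops and its parameters are hypothetical, and chain breaks are ignored.

```python
import re

# Assumption: these DSSP codes are treated as secondary structural elements.
# The text above does not state which codes were actually used.
SSE_CODES = "HGIEB"

# A loop candidate is a maximal run of residues that are not in an SSE.
_NON_SSE_RUN = re.compile("[^" + SSE_CODES + "]+")


def find_loops(dssp, min_len=1, max_len=15):
    """Return (start, end) pairs of 0-based, inclusive string indices.

    Only runs flanked by an SSE on both sides are reported, so the
    unstructured chain termini are skipped.
    """
    loops = []
    for match in _NON_SSE_RUN.finditer(dssp):
        start, end = match.start(), match.end() - 1
        if start == 0 or match.end() == len(dssp):
            continue  # terminus: does not connect two SSEs
        if min_len <= end - start + 1 <= max_len:
            loops.append((start, end))
    return loops
```

Grouping the returned segments by length and drawing 50 of each would then give sets of the kind described above.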

These test sets can be downloaded from the table below. Each file listed in the column 'Parameterization test set' contains 50 loops of the same length.

In our paper [2], we made structure predictions for the loop test sets from [1]. The test sets were slightly modified for technical reasons (please see the paper [2] for details). Our versions can be downloaded from the column 'Fiser test set' of the table below. The original test sets are available at http://www.salilab.org. They included 40 loops each, some of which we had to omit from our tests. Where this was the case, the number of remaining loops is given in parentheses in the table.

The last column of the table contains an additional set of 14 loops of different lengths, which have already been used by several authors. We have submitted these loops to a number of prediction web servers and listed the results, including previously published numbers, here.

Loop Test Sets

Length  Parameterization test set  Fiser test set          Additional test set (14 loops of different lengths)
1       param_testset_1.txt        Sali_loops_1.txt        mixed_testset.txt
2       param_testset_2.txt        Sali_loops_2.txt
3       param_testset_3.txt        Sali_loops_3.txt
4       param_testset_4.txt        Sali_loops_4.txt (39)
5       param_testset_5.txt        Sali_loops_5.txt
6       param_testset_6.txt        Sali_loops_6.txt (39)
7       param_testset_7.txt        Sali_loops_7.txt (39)
8       param_testset_8.txt        Sali_loops_8.txt
9       param_testset_9.txt        Sali_loops_9.txt
10      param_testset_10.txt       Sali_loops_10.txt
11      param_testset_11.txt       Sali_loops_11.txt
12      param_testset_12.txt       Sali_loops_12.txt (39)
13      param_testset_13.txt       Sali_loops_13.txt
14      param_testset_14.txt       Sali_loops_14.txt (38)
15      param_testset_15.txt       ---

In these files, one loop is listed per line. Each line gives the PDB code, the PDB number of the starting residue, the PDB number of the last loop residue, and the amino acid sequence.
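A minimal sketch of reading such a file in Python follows. It assumes the four fields are separated by whitespace, which the description above does not state; adjust the split if the files use another delimiter. The names Loop, parse_loop_line and read_loops are hypothetical. Residue numbers are kept as strings because PDB numbering can carry insertion codes.

```python
from typing import NamedTuple


class Loop(NamedTuple):
    pdb_code: str
    start_res: str   # PDB number of the starting residue, kept as text
    end_res: str     # PDB number of the last loop residue, kept as text
    sequence: str    # amino acid sequence of the loop


def parse_loop_line(line):
    """Parse one line of a test set file (assumed whitespace-separated)."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("expected 4 fields, got %d: %r" % (len(fields), line))
    return Loop(*fields)


def read_loops(lines):
    """Parse every non-blank line from an iterable of lines."""
    return [parse_loop_line(line) for line in lines if line.strip()]
```

For example, with a made-up line (not taken from any of the files), parse_loop_line("1abc 45 50 GSDNKT") yields a record whose sequence field is "GSDNKT". A whole file can be read with read_loops(open("param_testset_6.txt")).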

 

References:

[1] A. Fiser, R. Kinh Gian Do and A. Šali. Modeling of loops in protein structures. Protein Science 2000, 9, 1753-1773.

[2] E. Michalsky, A. Goede and R. Preissner. Loops In Proteins (LIP) - a comprehensive loop database for homology modelling. Protein Engineering 2003, 16, 1-7.