PPI3D Help

Query input

Three input modes are available in the PPI3D server. The single-sequence and two-sequences search modes take protein sequences as input and search for homologous protein-protein interactions in the PDB. The PDB entry search retrieves all protein-protein interactions in one PDB entry, given its ID.

In order to search for structural data on interactions of homologous proteins, you have to input one or more protein sequences.

Query sequences can be entered in two ways. You may paste FASTA-formatted protein sequences or enter a comma-separated list of UniProt accession codes (ACs) into the corresponding fields. In the latter case, the query sequences are retrieved automatically from UniProt. If FASTA-formatted input contains more than one protein sequence, each sequence is expected to have a unique identifier.
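Because unique identifiers are expected in multi-sequence FASTA input, it can help to check a file before pasting it. The following is a minimal sketch, not part of PPI3D: it assumes that the identifier is the first whitespace-delimited word after ">", which may differ from how the server parses headers. The function names and the example sequences are hypothetical.

```python
def fasta_ids(text):
    """Return the identifier (first word after '>') of every FASTA record.

    Assumption: the identifier is the first whitespace-delimited token of
    the header line; the server may apply a different rule.
    """
    ids = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:].split()
            ids.append(header[0] if header else "")
    return ids


def duplicated_ids(text):
    """Return identifiers that occur more than once, in order of first repeat."""
    seen, dupes = set(), []
    for seq_id in fasta_ids(text):
        if seq_id in seen and seq_id not in dupes:
            dupes.append(seq_id)
        seen.add(seq_id)
    return dupes


# Made-up input for illustration only.
example = """>protA example one
MKTAYIAKQR
>protB
GSHMLEDPVA
>protA duplicate
MKTAYIAKQR
"""
print(duplicated_ids(example))  # ['protA']
```

An empty list means that every record has a distinct identifier under this assumption.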

By entering a sequence in the Single-sequence mode, you may search for all the interactions of a query protein and its homologs available in the structural database. Instead of a single sequence, you may enter multiple sequences. In that case, the search is performed for every input sequence.

Input sequences

In the Two-sequences search mode you have to input at least two sequences, one into the "First subunit" form and the other into the "Second subunit" form. As in the Single-sequence mode, UniProt ACs can be entered instead of sequences.

In this case, the search returns only protein-protein interactions in which the first and the second proteins (or their homologs) interact with each other.

You may also enter multiple sequences in each form. In that case, all possible pairwise combinations of a sequence from the first form with a sequence from the second form are queried. This may be useful for finding out whether any proteins from two groups have experimentally solved 3D structures of their complexes (or of complexes of their homologs).
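The pairwise combinations described above can be pictured as a Cartesian product of the two input groups. This is only an illustrative sketch of that reading, not the server's code; the function name and the placeholder labels are hypothetical.

```python
from itertools import product


def query_pairs(first_subunit, second_subunit):
    """Pair every sequence of the first group with every one of the second."""
    return list(product(first_subunit, second_subunit))


# Placeholder labels, not real identifiers.
first = ["A1", "A2"]
second = ["B1", "B2", "B3"]

pairs = query_pairs(first, second)
print(len(pairs))  # 6
print(pairs[0])    # ('A1', 'B1')
```

The number of queried pairs grows as the product of the two group sizes, which is worth keeping in mind for large inputs.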

Input sequences

You may choose either BLAST to find only close homologs or PSI-BLAST to also include remote homologs. The PSI-BLAST method builds a profile from the clustered NCBI non-redundant protein sequence database and then uses this profile to search the PPI3D sequence database. The NCBI BLAST+ software [1] is used for both types of searches. By clicking on "Advanced sequence search options" you may customize your sequence search settings.

Search methods

Some searches, especially using PSI-BLAST, may take some time. Therefore you may want to either bookmark the page where results will be displayed or enter your email address in order to receive a link to the results once the job is finished. Optionally, you can specify a name for your job.

If you want to explore all binary protein-protein interactions present in one PDB entry, you may retrieve them by inputting a PDB ID into the PDB entry search form.

Input PDB ID

Analyzing the sequence search results

The results of a sequence search are displayed in a stepwise manner. First, only the summary of results is shown, with the numbers of protein-protein interactions found for each query sequence or sequence pair. Then all homologous binary protein-protein interactions for one sequence (or sequence pair) can be listed and analyzed in the "Clustered results" table. The detailed data for every interaction interface are shown on the "Interaction details" page.

For a single-sequence search, the results are the binding sites in the homologs of the query sequence that bind other proteins or peptides.

For a two-sequences search, the results are the interaction interfaces in which the first protein is a homolog of the input sequence for the first subunit and the second protein is a homolog of the input sequence for the second subunit.

When the job finishes, a window containing the summary of results will be displayed (Single-sequence example, Two-sequences example).

If the search was performed using multiple sequences (single-sequence mode) or two groups of multiple sequences (two-sequences mode), only those sequences or sequence pairs that produced results are reported.

Results summary for single-sequence query

For single-sequence queries, the identified binding sites are displayed. They are classified into protein-protein, protein-peptide, domain-domain and domain-peptide binding sites. Proteins are full protein chains in the PDB; domains correspond to protein domains in the SCOPe classification. PDB proteins and SCOPe domains with fewer than 20 residues are classified as peptides.
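The classification rule can be sketched in a few lines. This is one possible reading of the description above, not the server's implementation: the binding site is analyzed at either the "protein" (full PDB chain) or "domain" (SCOPe domain) level, and the partner counts as a peptide when it has fewer than 20 residues. The function and constant names are hypothetical.

```python
PEPTIDE_MAX_LENGTH = 20  # partners with fewer residues are treated as peptides


def binding_site_class(level, partner_length):
    """Return a label such as 'protein-peptide' or 'domain-domain'.

    level: 'protein' or 'domain', the level at which the site is analyzed.
    partner_length: number of residues in the binding partner.
    """
    if level not in ("protein", "domain"):
        raise ValueError("level must be 'protein' or 'domain'")
    partner = "peptide" if partner_length < PEPTIDE_MAX_LENGTH else level
    return f"{level}-{partner}"


print(binding_site_class("protein", 12))   # protein-peptide
print(binding_site_class("domain", 150))   # domain-domain
```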

Results summary for two-sequences query

For two-sequences queries, binary interaction interfaces are shown. The results are classified as "protein-protein" or "domain-domain" interfaces.

By default, clustering of interactions is applied in order to reduce the redundancy of PDB data. Experimentally determined structures of interacting proteins/peptides having resolution better than 4 Å are first clustered by protein sequences and then by interface (binding site) similarity. You may choose several clustering levels:

CAD-score is used to define the similarity of interface residue contacts [2], and a modified version of this software calculates similarity of residue areas, interface areas, and binding site areas.

The binding sites are clustered according to only one of the interacting proteins. As a result, binding sites that interact with completely different proteins or peptides may fall into the same cluster. Conversely, two entries may be shown for a homo-interaction interface if the binding sites in the two subunits differ.

The results of clustering are pre-calculated in the PPI3D database and therefore changes in the clustering levels take effect immediately after pressing "Apply". This is useful for selecting an optimal number of clusters for detailed analysis.

A more detailed analysis of the identified protein-protein interaction data can be done in the Clustered results window.

Data for the identified interactions, including the PDB entry information, BLAST E-values, SCOPe data and some of the interaction interface properties, are displayed in a table. This table can be sorted by any column. Table rows can be filtered by entering criteria into the boxes in the table header. Tables may be filtered by text, regular expressions or numerical values (>, >=, <, <=, or a range using "-"). Filters can also be combined using "and" and "or".

If you would like to analyze the results on your own computer, you may download the table in tab-delimited format by clicking "Download table data". By default, the active filters are also applied to the downloaded data.
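A downloaded tab-delimited table can be processed with standard tools. The sketch below uses only the Python standard library; the column names ("Entry", "E-value", "Cluster size") and the rows are placeholders made up for illustration and may not match the headers of a real PPI3D download.

```python
import csv
import io


def read_table(handle):
    """Parse a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def keep_at_most(rows, column, max_value):
    """Keep rows whose numeric value in `column` does not exceed `max_value`."""
    return [row for row in rows if float(row[column]) <= max_value]


# Placeholder data standing in for a downloaded file; not real PPI3D output.
sample = io.StringIO(
    "Entry\tE-value\tCluster size\n"
    "entry_1\t1e-50\t12\n"
    "entry_2\t0.002\t3\n"
    "entry_3\t1e-8\t1\n"
)
rows = read_table(sample)
significant = keep_at_most(rows, "E-value", 1e-5)
print([row["Entry"] for row in significant])  # ['entry_1', 'entry_3']
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`.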

The last column of the table shows the number of members in the cluster, for which only one representative is shown. It is possible to list all the members of the cluster by clicking on this number.

If you want to see all the available details for a specific interaction, click on its number in the first column. If you want to summarize interface data for several interactions, select the corresponding table rows and click on "Summarize selected interactions".

Clustered results table

The Clustered results table can also be filtered by interaction type or clustering level using the filtering settings below the table.

Clustered results table filtering

Moreover, the columns of the table can be hidden, enabling the user to focus on the most important data. Note that some columns (like protein IDs or alignment coverage) are hidden in the default view.

Clustered results table columns

In some cases the clustering of protein interaction interfaces and binding sites does not group all similar protein interactions into the same cluster. This can happen, for example, if the sequences of proteins differ too much. Therefore sometimes it is necessary to visually compare the results from several clusters.

If you select several protein interactions in the "Clustered results" window and click on the "Summarize selected interactions" button, the "Aligned interactions" window opens in a new tab.

Clustered results table with selected rows

First, a table with the sequences of the result proteins aligned to the query sequence is shown. The residues that are part of the interaction interface are highlighted in red. The residues of the result proteins that are missing from the BLAST or PSI-BLAST alignment are displayed as spaces, and the residues that are aligned to gaps in the query sequence are omitted from this multiple sequence alignment.

Multiple sequence alignment

You may select two or more protein interactions for the structural alignment.

The structures of binary protein interactions are then aligned chain by chain using TMalign.

For a two-sequences query, two JSmol windows are shown, displaying the interactions aligned according to the first and the second chains, respectively. For a single-sequence query, only one alignment is shown, because the second chains may be very different.

Aligned interaction structures

Individual structures can be hidden in JSmol by deselecting their checkboxes. Interaction details can be viewed by clicking the "details" link, and the aligned structures can be downloaded for more detailed analysis by clicking the "download" link.

The Interaction details window is divided into three main sections.

The first section shows PDB and SCOPe data for the protein-protein interaction.

The "Structural properties" section lists calculated properties of the interaction interface. In addition, it lets you explore, using JSmol, the structures of both the interacting domain pair and the entire biological assembly in which the interaction takes place. The interface is shown in different colors, and its residues that form hydrogen bonds, salt bridges or disulphide bonds can be visualized by clicking the JSmol buttons. Coordinates of both structures can be downloaded. You can also create a homology model for your sequences using the structure of the interacting domain pair as a template. The model is constructed with MODELLER [3] and includes only the sequence regions aligned with the template (see "Sequence alignments" below). All the modeling data (the input alignment, the modeling script, the model and other data) are available for download.

Structural properties

The visualization of the interface structure can also be downloaded as a PyMOL script. Download the script file and open it with PyMOL. The functions that are available in JSmol are also implemented in this script. They are accessible by clicking buttons in the PyMOL Control Panel or from the command line.

PyMOL

The bottom of the Structural properties section contains tables with information about individual interface residues. To keep the view concise, this part is hidden when the page is loaded.

The residue numbers in the 3D structure and in the sequence alignment are given in separate columns, as they often differ. "Buried ASA, Å²" is the solvent-accessible surface area (ASA) of the residue that becomes buried when the interface is formed from the unbound subunits. "Buried ASA, %" indicates what percentage of the residue's ASA in the unbound subunit becomes buried.
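The relation between the two buried ASA columns can be illustrated with a short calculation. This sketch follows the definitions given above (buried area as the difference between the residue ASA in the unbound subunit and in the complex); it is an assumption about how the values relate, not the server's own code, and the function name and numbers are made up.

```python
def buried_asa(asa_unbound, asa_in_complex):
    """Return (buried area in Å², buried percentage) for one residue.

    asa_unbound: residue ASA in the isolated (unbound) subunit, in Å².
    asa_in_complex: residue ASA in the complex, in Å².
    """
    buried = asa_unbound - asa_in_complex
    # A residue with no accessible surface cannot bury anything.
    percent = 100.0 * buried / asa_unbound if asa_unbound > 0 else 0.0
    return buried, percent


print(buried_asa(80.0, 20.0))  # (60.0, 75.0)
```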

Interface residues table

The inter-residue contacts across the interaction interface are listed in a separate table. Below this table there is a button that sends the structure to a separate server dedicated to more detailed contact analysis.

Interface contacts table

If the interaction interface contains any small-molecule ligands, they are also listed in a table. Clicking on the name of a ligand molecule opens its detailed information at the RCSB PDB website.

Interface ligands table

Sorting, filtering and downloading are possible for all tables, as described above in the "Clustered results window" section.

The "Sequence alignments" section relates your query sequences to the experimental structure by showing the corresponding amino acid residues. Sequence conservation parameters are displayed for both the entire alignment and the interaction interface residues. The residues that are at the interface are highlighted. Additionally, the secondary structure of the result protein is displayed (H - helix, E - strand). The sequence alignment can be downloaded in FASTA format.

Sequence alignment

Some parts of the "Structural properties" and "Sequence alignments" sections can be hidden by clicking on their headers.

A sequence search log page is available for every job. This page displays the sequence search settings and other job details. In addition, the numbers of protein structures found and of protein-protein interactions detected in these structures are displayed for every query sequence. Examining these tables may be especially useful when the server produces no results.

Search log

Analyzing the PDB entry search results

The PDB entry search produces almost the same output as the sequence search, except that analysis of sequence alignments and modeling of protein structures are not possible, as no protein sequence is given. The other features (clustering of interactions, listing of their properties and detailed analysis of structures, residues and inter-residue contacts at the interaction interface) work in the same way as described above.

Troubleshooting

If you experience any problems using the PPI3D web server or have any suggestions on how it could be improved, you may contact us by email at ppi3d (at) ibt (dot) lt.

Citing the PPI3D web server

If the PPI3D software is useful for your research, please cite the following article:

Dapkūnas J, Timinskas A, Olechnovič K, Margelevičius M, Dičiūnas R, Venclovas Č. The PPI3D web server for searching, analyzing and modeling protein-protein interactions in the context of 3D structures. Bioinformatics, in press. DOI: 10.1093/bioinformatics/btw756

References

  1. Camacho et al., BLAST+: architecture and applications, BMC Bioinformatics, 2009, 10:421.
  2. Olechnovic et al., CAD-score: a new contact area difference-based function for evaluation of protein structural models, Proteins, 2013, 81:149.
  3. Sali and Blundell, Comparative protein modelling by satisfaction of spatial restraints, J Mol Biol, 1993, 234:779.