Non-redundant dataset
Non-redundant datasets were extracted from the full PyDISH data using PISCES (G. Wang and R. L. Dunbrack Jr, Nucleic Acids Res. 33, W94, 2005). The sequence identity thresholds were set to 25% and 40%, which yielded 228 and 423 PDB chains, respectively. We selected the PyDISH entries that contain these PDB chains and analyzed their composition in terms of axial ligands, protein function, and protein fold (CATH level C).
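The selection and composition steps described above can be illustrated with a minimal sketch. It is not the code used for this analysis. It assumes that the culled chains are given as identifiers made of a four-character PDB ID followed by the chain ID (for example "1AAAA"), and that each PyDISH entry is available as a record with the hypothetical fields pdb_id, chain, axial_ligand, and cath_class. The records and chain list below are made-up placeholders, not real data.

```python
from collections import Counter


def parse_chain_id(token):
    """Split an identifier such as '1AAAA' into ('1AAA', 'A').

    Assumption: the first four characters are the PDB ID and the
    remainder is the chain ID.
    """
    token = token.strip()
    return token[:4].upper(), token[4:]


def select_entries(entries, chain_ids):
    """Keep only the entries whose (PDB ID, chain) pair is in the culled list."""
    wanted = {parse_chain_id(t) for t in chain_ids}
    return [e for e in entries if (e["pdb_id"].upper(), e["chain"]) in wanted]


def composition(entries, key):
    """Return the fraction of entries for each value of the given field."""
    counts = Counter(e[key] for e in entries)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


# Placeholder records with hypothetical field names (not real PyDISH data).
toy_entries = [
    {"pdb_id": "1AAA", "chain": "A", "axial_ligand": "His", "cath_class": "1"},
    {"pdb_id": "1AAA", "chain": "B", "axial_ligand": "His", "cath_class": "1"},
    {"pdb_id": "2BBB", "chain": "A", "axial_ligand": "His/Met", "cath_class": "1"},
    {"pdb_id": "3CCC", "chain": "A", "axial_ligand": "Cys", "cath_class": "3"},
]
# Placeholder list of culled chains (not real PISCES output).
toy_chains = ["1AAAA", "3CCCA"]

selected = select_entries(toy_entries, toy_chains)
ligand_fractions = composition(selected, "axial_ligand")
class_fractions = composition(selected, "cath_class")
```

Matching on the (PDB ID, chain) pair rather than on the PDB ID alone keeps redundant chains of the same structure out of the selected set. The same composition function can be applied to a protein function field in the same way.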