kripodb.db

SQLite-based data storage for fragments and fingerprints.

Registers BitMap and molblockgz data types in sqlite.
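
Registering a custom data type with sqlite3 takes an adapter (Python object to stored value) and a converter (stored value back to Python object). The sketch below illustrates that mechanism with the standard library only. IntSet, adapt_intset, convert_intset, and the table layout are hypothetical stand-ins; the serialization kripodb actually uses for BitMap is not shown in this reference and may differ.

```python
import sqlite3
import zlib


class IntSet(frozenset):
    """Hypothetical stand-in for a bitset type such as BitMap."""


def adapt_intset(bits):
    # Assumed format: sorted, comma-separated integers, zlib compressed.
    text = ','.join(str(i) for i in sorted(bits))
    return zlib.compress(text.encode('ascii'))


def convert_intset(blob):
    text = zlib.decompress(blob).decode('ascii')
    return IntSet(int(i) for i in text.split(',') if i)


# Adapter is keyed on the Python type, converter on the declared column type.
sqlite3.register_adapter(IntSet, adapt_intset)
sqlite3.register_converter('intset', convert_intset)

# PARSE_DECLTYPES makes sqlite3 pick the converter from the column declaration.
connection = sqlite3.connect(':memory:', detect_types=sqlite3.PARSE_DECLTYPES)
connection.execute('CREATE TABLE bitsets (frag_id TEXT PRIMARY KEY, bits intset)')
connection.execute('INSERT INTO bitsets VALUES (?, ?)',
                   ('frag1', IntSet([1, 2, 3, 4])))
restored = connection.execute('SELECT bits FROM bitsets').fetchone()[0]
connection.close()
```

With both functions registered, values round-trip through the table without explicit serialization calls at the call site.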

class kripodb.db.FastInserter(cursor)[source]

Use in a with statement to make inserting faster, but less safe.

Works by setting the journal mode to WAL and turning synchronous off.

Parameters:cursor (sqlite3.Cursor) – Sqlite cursor

Examples

>>> with FastInserter(cursor):
...     cursor.executemany('INSERT INTO table VALUES (?)', rows)
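
As a rough illustration of the technique described above, the following sketch sets the two pragmas around a bulk insert with the standard sqlite3 module. The fast_inserts helper, the numbers table, and the choice to restore synchronous to FULL afterwards are assumptions made for this example; they are not taken from the FastInserter source.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def fast_inserts(connection):
    """Hypothetical helper: trade durability for insert speed."""
    cursor = connection.cursor()
    # Both pragmas must run outside an open transaction.
    cursor.execute('PRAGMA journal_mode=WAL')
    cursor.execute('PRAGMA synchronous=OFF')
    try:
        yield cursor
        connection.commit()
    except Exception:
        connection.rollback()
        raise
    finally:
        # Assumption: go back to the safest setting once the batch is done.
        cursor.execute('PRAGMA synchronous=FULL')


# WAL needs a file-backed database, so a temporary file is used here.
path = os.path.join(tempfile.mkdtemp(), 'example.sqlite')
connection = sqlite3.connect(path)
connection.execute('CREATE TABLE IF NOT EXISTS numbers (value INTEGER)')
rows = [(i,) for i in range(1000)]
with fast_inserts(connection) as cursor:
    cursor.executemany('INSERT INTO numbers VALUES (?)', rows)
count = connection.execute('SELECT count(*) FROM numbers').fetchone()[0]
connection.close()
```

With synchronous off, a crash or power loss during the batch can corrupt or lose data, which is the "less safe" part of the trade-off.
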
class kripodb.db.FingerprintsDb(filename)[source]

Fingerprints database

as_dict(number_of_bits=None)[source]

Returns a dict-like object to query and alter fingerprints db

Parameters:number_of_bits (Optional[int]) – Number of bits that all fingerprints have
Returns:BitMapDict
create_tables()[source]

Abstract method which is called after connecting to database so tables can be created.

Use CREATE TABLE IF NOT EXISTS … in method to prevent duplicate create errors.

class kripodb.db.FragmentsDb(filename)[source]

Fragments database

add_fragment(frag_id, pdb_code, prot_chain, het_code, frag_nr, atom_codes, hash_code, het_chain, het_seq_nr, nr_r_groups)[source]

Add fragment to database

Parameters:
  • frag_id (str) – Fragment identifier
  • pdb_code (str) – Protein databank identifier
  • prot_chain (str) – Major chain of pdb on which pharmacophore is based
  • het_code (str) – Ligand/Hetero code
  • frag_nr (int) – Fragment number; the whole ligand has number 1, fragments have numbers >1
  • atom_codes (str) – Comma separated list of HETATOM atom names which make up the fragment (hydrogens are excluded)
  • hash_code (str) – Unique identifier for fragment
  • het_chain (str) – Chain ligand is part of
  • het_seq_nr (int) – Residue sequence number of ligand the fragment is a part of
  • nr_r_groups (int) – Number of R groups in fragment
add_fragments_from_shelve(myshelve, skipdups=False)[source]

Adds fragments from shelve to fragments table.

Also creates index on pdb_code column.

Parameters:
  • myshelve (Dict[Fragment]) – Dictionary with fragment identifier as key and fragment as value.
  • skipdups (bool) – Skip duplicates instead of dying on the first duplicate
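
One way a skipdups flag can behave is sketched below: duplicates raise sqlite3.IntegrityError, which is either swallowed or re-raised. The add_fragments helper, the two-column fragments table, and the index name are made up for this example and are much simpler than the real schema.

```python
import sqlite3


def add_fragments(cursor, rows, skipdups=False):
    """Hypothetical helper: insert rows, optionally ignoring duplicates."""
    for row in rows:
        try:
            cursor.execute(
                'INSERT INTO fragments (frag_id, pdb_code) VALUES (?, ?)', row)
        except sqlite3.IntegrityError:
            # A duplicate primary key ends up here.
            if not skipdups:
                raise


connection = sqlite3.connect(':memory:')
cursor = connection.cursor()
cursor.execute('CREATE TABLE IF NOT EXISTS fragments '
               '(frag_id TEXT PRIMARY KEY, pdb_code TEXT)')
rows = [('frag_a', 'pdb1'), ('frag_b', 'pdb1'), ('frag_a', 'pdb1')]
add_fragments(cursor, rows, skipdups=True)
# Index on pdb_code, mirroring the behaviour described above.
cursor.execute('CREATE INDEX IF NOT EXISTS fragments_pdb_code_i '
               'ON fragments (pdb_code)')
connection.commit()
stored = cursor.execute('SELECT count(*) FROM fragments').fetchone()[0]
connection.close()
```
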
add_molecule(mol)[source]

Adds molecule to molecules table

Uses the name of the molecule as the primary key.

Parameters:mol (rdkit.Chem.AllChem.Mol) – the rdkit molecule
add_molecules(mols)[source]

Adds molecules to the molecules table.

Parameters:mols (list[rdkit.Chem.Mol]) – List of molecules
add_pdbs(pdbs)[source]

Adds pdb metadata to the pdbs table.

Parameters:pdbs (Iterable[Dict]) – List of pdb metadata
by_pdb_code(pdb_code)[source]

Retrieve fragments which are part of a PDB structure.

Parameters:pdb_code (str) – PDB code
Returns:List of fragments
Return type:List[Fragment]
Raises:LookupError – When pdb_code could not be found
create_tables()[source]

Create tables if they don’t exist

id2label()[source]

Lookup table of fragments from a number to a label.

Returns:SqliteDict
is_ligand_stored(pdb_code, het_code)[source]

Check whether ligand is already in database

Parameters:
  • pdb_code (str) – Protein databank identifier
  • het_code (str) – Ligand/hetero identifier
Returns:bool

label2id()[source]

Lookup table of fragments from a label to a number.

Returns:SqliteDict
class kripodb.db.IntbitsetDict(db, number_of_bits=None)[source]

Dictionary of BitMaps with sqlite3 backend.

Parameters:number_of_bits (int) – Number of bits the bitsets consist of

update([E, ]**F) → None. Update D from mapping/iterable E and F.[source]

If E is present and has a .keys() method, does: for k in E: D[k] = E[k]

If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v

In either case, this is followed by: for k, v in F.items(): D[k] = v
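
The three call forms described above can be tried on a plain dict, which is used here only as a stand-in; the keys and integer values are made up, whereas the dictionary documented here holds BitMaps.

```python
fingerprints = {}

# E is a mapping with a .keys() method
fingerprints.update({'frag1': 1, 'frag2': 2})

# E is an iterable of (key, value) pairs
fingerprints.update([('frag3', 3)])

# F: keyword arguments are applied last and overwrite existing keys
fingerprints.update(frag1=10)
```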

class kripodb.db.SqliteDb(filename)[source]

Wrapper around a sqlite database connection

Database is created if it does not exist.

Parameters:filename (str) – Sqlite filename
connection

sqlite3.Connection – Sqlite connection

cursor

sqlite3.Cursor – Sqlite cursor

close()[source]

Close database

commit()[source]

Commit pending changes

create_tables()[source]

Abstract method which is called after connecting to database so tables can be created.

Use CREATE TABLE IF NOT EXISTS … in method to prevent duplicate create errors.
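
A subclass following this pattern might look like the sketch below. MiniSqliteDb is a hypothetical stand-in that mirrors the documented shape of the base class (connection, cursor, and a create_tables() hook called after connecting) so the example runs without kripodb installed; NotesDb and its notes table are invented for illustration.

```python
import sqlite3


class MiniSqliteDb:
    """Hypothetical stand-in for the documented base class."""

    def __init__(self, filename):
        self.connection = sqlite3.connect(filename)
        self.cursor = self.connection.cursor()
        self.create_tables()

    def create_tables(self):
        raise NotImplementedError

    def commit(self):
        self.connection.commit()

    def close(self):
        self.connection.close()


class NotesDb(MiniSqliteDb):
    def create_tables(self):
        # IF NOT EXISTS makes the call safe on an already initialised file.
        self.cursor.execute('CREATE TABLE IF NOT EXISTS notes '
                            '(id INTEGER PRIMARY KEY, body TEXT)')


db = NotesDb(':memory:')
db.create_tables()  # calling it a second time does not raise
db.cursor.execute('INSERT INTO notes (body) VALUES (?)', ('hello',))
db.commit()
note_count = db.cursor.execute('SELECT count(*) FROM notes').fetchone()[0]
db.close()
```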

class kripodb.db.SqliteDict(connection, table_name, key_column, value_column)[source]

Dict-like object of 2 columns of a sqlite table.

Can be used to query and alter the table.

Parameters:
  • connection (sqlite3.Connection) – Sqlite connection
  • table_name (str) – Table name
  • key_column (str) – Column name used as key
  • value_column (str) – Column name used as value
connection

sqlite3.Connection – Sqlite connection

cursor

sqlite3.Cursor – Sqlite cursor
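
To make the idea of a dict-like view over two columns concrete, here is a minimal sketch built on collections.abc.MutableMapping. TwoColumnDict and the labels table are hypothetical and only share the constructor arguments listed above; this is not the SqliteDict implementation. It assumes the key column is a primary key or otherwise unique, and that table and column names come from trusted code, because identifiers cannot be bound as query parameters.

```python
import sqlite3
from collections.abc import MutableMapping


class TwoColumnDict(MutableMapping):
    """Hypothetical dict-like view of a key column and a value column."""

    def __init__(self, connection, table_name, key_column, value_column):
        self.connection = connection
        self.cursor = connection.cursor()
        names = dict(t=table_name, k=key_column, v=value_column)
        self._get = 'SELECT {v} FROM {t} WHERE {k} = ?'.format(**names)
        self._set = ('INSERT OR REPLACE INTO {t} ({k}, {v}) '
                     'VALUES (?, ?)').format(**names)
        self._del = 'DELETE FROM {t} WHERE {k} = ?'.format(**names)
        self._keys = 'SELECT {k} FROM {t}'.format(**names)
        self._len = 'SELECT count(*) FROM {t}'.format(**names)

    def __getitem__(self, key):
        row = self.cursor.execute(self._get, (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]

    def __setitem__(self, key, value):
        self.cursor.execute(self._set, (key, value))

    def __delitem__(self, key):
        self.cursor.execute(self._del, (key,))
        if self.cursor.rowcount == 0:
            raise KeyError(key)

    def __iter__(self):
        # Fetch all keys first so lookups on the shared cursor during
        # iteration do not reset the key query.
        return iter([row[0] for row in self.cursor.execute(self._keys).fetchall()])

    def __len__(self):
        return self.cursor.execute(self._len).fetchone()[0]


connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE labels (label TEXT PRIMARY KEY, num INTEGER)')
labels = TwoColumnDict(connection, 'labels', 'label', 'num')
labels['frag_a'] = 1
labels['frag_b'] = 2
labels['frag_a'] = 10  # replaces the earlier row for the same key
snapshot = dict(labels)  # copies every pair into an in-memory dict
```

Copying into a plain dict, as in the last line, avoids one query per lookup when the same pairs are read many times.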

items() → list of D's (key, value) pairs, as 2-tuples[source]
iteritems() → an iterator over the (key, value) items of D[source]
iteritems_startswith(prefix)[source]

Iterates over the items whose key starts with the given prefix.

Parameters:prefix (str) – Prefix of key

Examples

All items with key starting with letter ‘a’ are returned.

>>> for frag_id, fragment in fragments.iteritems_startswith('a'):
...     # do something with frag_id and fragment
...     pass
Returns:List[Tuple[key, value]]
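
A prefix lookup of this kind can be expressed in SQL in several ways. The sketch below uses substr() so the prefix needs no wildcard escaping; the helper function, the items table, and its column names are hypothetical, and the query kripodb runs may be different.

```python
import sqlite3


def iteritems_startswith(cursor, prefix):
    """Hypothetical helper: yield (name, number) rows whose name starts with prefix."""
    # Compare the leading characters exactly instead of using LIKE or GLOB.
    sql = ('SELECT name, number FROM items '
           'WHERE substr(name, 1, ?) = ? ORDER BY name')
    for row in cursor.execute(sql, (len(prefix), prefix)):
        yield row


connection = sqlite3.connect(':memory:')
cursor = connection.cursor()
cursor.execute('CREATE TABLE items (name TEXT PRIMARY KEY, number INTEGER)')
cursor.executemany('INSERT INTO items VALUES (?, ?)',
                   [('apple', 1), ('avocado', 2), ('banana', 3)])
matches = list(iteritems_startswith(cursor, 'a'))
connection.close()
```
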
itervalues() → an iterator over the values of D[source]
materialize()[source]

Fetches all key/value pairs from the sqlite database.

Useful when the dictionary is iterated multiple times and the cost of fetching is too high.

Returns:Dictionary with all key/value pairs
Return type:Dict
values() → list of D's values[source]
kripodb.db.adapt_BitMap(ibs)[source]

Convert BitMap to its serialized format

Parameters:ibs (BitMap) – bitset

Examples

Serialize BitMap

>>> adapt_BitMap(BitMap([1, 2, 3, 4]))
'xœ“c@ð'
Returns:serialized BitMap
Return type:str
kripodb.db.adapt_molblockgz(mol)[source]

Convert RDKit molecule to compressed molblock

Parameters:mol (rdkit.Chem.Mol) – molecule
Returns:Compressed molblock
Return type:str
kripodb.db.convert_BitMap(s)[source]

Convert serialized BitMap to BitMap

Parameters:s (str) – serialized BitMap

Examples

Deserialize BitMap

>>> convert_BitMap('xœ“c@ð')
BitMap([1, 2, 3, 4])
Returns:bitset
Return type:BitMap
kripodb.db.convert_molblockgz(molgz)[source]

Convert compressed molblock to RDKit molecule

Parameters:molgz (str) – zlib compressed molblock
Returns:molecule
Return type:rdkit.Chem.Mol
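
The compression half of the molblockgz round trip can be sketched with zlib alone. The helper names and the single-atom molblock text below are placeholders for illustration; the RDKit step is deliberately left out so the example runs without RDKit. With RDKit available, Chem.MolToMolBlock and Chem.MolFromMolBlock would be the natural calls on either side, though whether kripodb uses exactly those is an assumption.

```python
import zlib


def compress_molblock(molblock):
    """Hypothetical helper: molblock text to zlib-compressed bytes."""
    return zlib.compress(molblock.encode('utf-8'))


def decompress_molblock(molgz):
    """Hypothetical helper: zlib-compressed bytes back to molblock text."""
    return zlib.decompress(molgz).decode('utf-8')


# Placeholder molblock text; it is only compressed here, never parsed.
molblock = (
    '\n'
    '  sketch\n'
    '\n'
    '  1  0  0  0  0  0  0  0  0  0999 V2000\n'
    '    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n'
    'M  END\n'
)
molgz = compress_molblock(molblock)
restored = decompress_molblock(molgz)
```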