
StructureFinder

A program to search for crystal structures on your hard drive or file server

StructureFinder indexes all .cif and/or .res files below one or more directories and makes them searchable. It can also search for unit cells in the Bruker APEX2/3 software database.

You may cite StructureFinder as:

D. Kratzert, I. Krossing, J. Appl. Cryst. 2019, 52, 468-471. https://doi.org/10.1107/S1600576719001638. (local copy)

 

StructureFinder download

 

Screenshots

In the main tab, you can import cif files into a database, save this database, open an existing database, or open the APEX2/3 database on your computer. To search the APEX database, just click on the 'Open APEX Database' button. The program will try to connect to the database on the local computer with the default username and password. If that fails, it will ask for different credentials.

Selecting a certain entry of the database shows the unit cell, the residuals and the asymmetric unit.

Basic search options for unit cells and text are also available. The cell search takes six parameters: a, b, c, α, β, γ. The search is fuzzy, so 10 10 10 90 90 90 finds the same cell as 10.00 10.00 10.00 90.00 90.00 90.00.

The algorithm first compares cells by volume (for speed) and then matches the remaining candidates using Lenstra-Lenstra-Lovász lattice basis reduction for cell reduction, combined with a sphere search that takes periodic boundary conditions into account. The lattice matching implementation comes from the pymatgen project (http://pymatgen.org/).

The tolerances for the cell search are:

regular
volume: ±3 %, length: 0.06 Å, angle: 1.0°

'more results' option
volume: ±9 %, length: 0.2 Å, angle: 2.0°
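
The volume pre-filter can be sketched in a few lines of Python. This is only an illustration, not StructureFinder's actual code: the function names cell_volume and volume_matches are made up for this example, and it assumes that the percentage is taken relative to the volume of the search cell. The subsequent lattice matching and the length and angle tolerances are not shown.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a unit cell from lengths in Å and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)


def volume_matches(search_cell, candidate_cell, tolerance_percent=3.0):
    """Cheap first filter: keep candidates whose volume lies within the tolerance.

    Assumption: the tolerance is relative to the volume of the search cell.
    """
    v_search = cell_volume(*search_cell)
    v_candidate = cell_volume(*candidate_cell)
    return abs(v_candidate - v_search) <= v_search * tolerance_percent / 100


search = (10, 10, 10, 90, 90, 90)
candidate = (10.2, 10.2, 10.0, 90, 90, 90)  # about 4 % larger volume

print(volume_matches(search, candidate))                         # regular: ±3 %
print(volume_matches(search, candidate, tolerance_percent=9.0))  # more results: ±9 %
```

With these numbers, the candidate is rejected by the regular setting and accepted by the 'more results' setting.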

The text search field searches the directory, name and .res file text data. You can combine words with the wildcards ? and *. For example, foo*bar means 'foo[any text]bar'.
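
The wildcard behavior can be imitated with Python's fnmatch module. This is a hedged sketch, not the search code of StructureFinder: the helper name text_matches is hypothetical, and it assumes shell-style semantics (* for any text, ? for exactly one character), substring matching, and case-insensitive comparison.

```python
from fnmatch import fnmatchcase


def text_matches(pattern, text):
    """Return True if the wildcard pattern occurs anywhere in text.

    Assumptions: * stands for any text, ? for exactly one character,
    and upper/lower case is ignored.
    """
    # Surrounding the pattern with * turns the full match into a substring match.
    return fnmatchcase(text.lower(), "*" + pattern.lower() + "*")


print(text_matches("foo*bar", "D:/data/foo_2019_bar.res"))  # any text between foo and bar
print(text_matches("foo?bar", "foobar"))                    # ? needs one character
```

Note that fnmatch also gives square brackets a special meaning, which the real search field may not do.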

StructureFinder runs best with 64-bit Python 3.6, numpy (any version) and PyQt 5.9. The Windows XP version has limited functionality because only Python 3.4 runs on Windows XP. The XP version is only intended to search for unit cells in the APEX database on an ancient Bruker framebuffer computer!

StructureFinder main window

Pro tip: Double click on the unit cell to copy the cell to the clipboard.

 

 

The 'all cif values' tab shows all cif values available in the database. These are most, but not necessarily all, values from the cif file.

list of all cif values

 

The 'Advanced Search' tab allows you to search for several options at a time and also to exclude parameters. I will add more options in the future, and I would be happy to receive suggestions for further search options.

advanced search tab

 


Indexing cif and/or res files

The 'Import Directory' button starts recursive indexing of cif and/or res files below the selected directory, depending on which file endings are enabled. It scans all subdirectories for cif/res files as well as zip files containing cif files. The indexing time mostly depends on the speed of your hard drive. Scanning a complete 256 GB SSD takes about 50 s. A complete eight-year-old file server with over 100 user directories and 10,000 crystal structures can take an hour.

You should not run the import over a network connection. It will take ages!

 

Command Line file Indexer

With the strf_cmd.py script, you can index directories without a graphical user interface.

The options -d and -e can be given multiple times like -d /foo -d /bar.

The command line indexer only needs Python >= 3.4.
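
Options that can be given multiple times are commonly implemented with argparse's "append" action. The following sketch only illustrates that idea. It is not the source of strf_cmd.py, and the parser name strf_cmd_sketch and the attribute names dirs and excludes are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser that mimics the repeatable -d and -e options."""
    parser = argparse.ArgumentParser(prog="strf_cmd_sketch")
    # action="append" collects every occurrence of the option into one list.
    parser.add_argument("-d", dest="dirs", action="append", metavar='"directory"',
                        help="Directory to search, may be given multiple times.")
    parser.add_argument("-e", dest="excludes", action="append", metavar='"directory"',
                        help="Directory name to exclude, may be given multiple times.")
    return parser


args = build_parser().parse_args(["-d", "/foo", "-d", "/bar", "-e", "TMP"])
print(args.dirs)      # ['/foo', '/bar']
print(args.excludes)  # ['TMP']
```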

 

$ python3.6 strf_cmd.py 
usage: strf_cmd.py [-h] [-d "directory"] [-e "directory"] [-o "file name"]
Command line version of StructureFinder to collect cif files to a database.
StructureFinder will search for cif/res files in the given directory(s)
recursively. (Either -c, -r or both options must be active!)
optional arguments:
-h, --help show this help message and exit
-d "directory" Directory(s) where cif files are located.
-e "directory" Directory names to be excluded from the file search. Default
is: "ROOT", ".OLEX", "TMP", "TEMP", "Papierkorb",
"Recycle.Bin" Modifying -e option discards the default.
-o "file name" Name of the output database file. Default: "structuredb.sqlite"
-c Add .cif files (crystallographic information file) to the database.
-r Add SHELX .res files to the database.
--delete       Delete and do not append to previous database.

 

 

Indexing Example

Creates the file test.sqlite in the current directory:

python3.6.exe ./strf_cmd.py -d D:\Github\StructureFinder -o test.sqlite -c -r --delete
collecting files below D:\Github\StructureFinder
74 files considered.
Added 262 files (258 cif, 4 res) files (212 in compressed files) to database in:
0 h, 0 m, 2.05 s
---------------------
Total 262 cif/res files in 'test.sqlite'. Duration: 0 h, 0 m, 2.81 s

 

By default, the command line version appends all data to an already existing database file. Use the --delete option to delete the previous database and start from scratch.

Should I add support for pdb files? Please send me an email if you think I should.

 

Database Format

The database format is just plain sqlite (http://www.sqlite.org/). You can view the database structure with the fantastic sqlitebrowser (http://sqlitebrowser.org/).

The database structure is not supposed to change, but I might add compression to the file format in the future. 
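
Because the file is plain sqlite, its structure can also be inspected with Python's built-in sqlite3 module. The sketch below makes no assumption about StructureFinder's table names and simply lists whatever tables and columns it finds. The helper name list_tables is made up for this example, and structuredb.sqlite is the default output name of strf_cmd.py.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return a dict that maps each table name to its list of column names."""
    con = sqlite3.connect(str(db_path))
    try:
        names = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        return {name: [col[1] for col in con.execute(f'PRAGMA table_info("{name}")')]
                for name in names}
    finally:
        con.close()


if __name__ == "__main__":
    db = Path("structuredb.sqlite")
    if db.exists():  # avoid creating an empty file by accident
        for table, columns in list_tables(db).items():
            print(table, columns)
```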

CSD search

StructureFinder is able to search for unit cells in the CSD with the CellCheckCSD program. As soon as CellCheckCSD is installed, you can search the CSD. Double-click on a result row to get the detailed structure page.

 

Web interface

Instead of the regular user interface, you can run StructureFinder as a web service. First, create a database with strf_cmd.py. A cron job can automate this so that the database is rebuilt regularly. Everything you need to start the web service is contained in the zip file strf_cmd_version.zip above or in the Git repository. The web interface requires the Gunicorn WSGI HTTP server (install it via "pip install gunicorn").
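
A small wrapper script that a cron job could call to rebuild the database might look like the sketch below. It uses only the options documented for strf_cmd.py (-d, -o, -c, -r, --delete). The function name build_index_command and the paths /path/to/structures and /path/to/database.sqlite are placeholders for this example, not part of StructureFinder.

```python
import subprocess
import sys
from pathlib import Path


def build_index_command(directories, output, cif=True, res=True,
                        delete=False, script="strf_cmd.py"):
    """Assemble the strf_cmd.py call from its documented options."""
    command = [sys.executable, script]
    for directory in directories:
        command += ["-d", str(directory)]  # -d may be given multiple times
    command += ["-o", str(output)]
    if cif:
        command.append("-c")
    if res:
        command.append("-r")
    if delete:
        command.append("--delete")  # rebuild instead of appending
    return command


if __name__ == "__main__":
    # Placeholder paths: adjust them to your own installation.
    cmd = build_index_command(["/path/to/structures"],
                              "/path/to/database.sqlite", delete=True)
    if Path("strf_cmd.py").exists():
        subprocess.run(cmd, check=True)
    else:
        print("Would run:", " ".join(cmd))
```

Run the script from the StructureFinder main directory so that strf_cmd.py is found.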

Change the variables

host = "127.0.0.1"
port = "80"
dbfilename = "/path/to/database.sqlite"

in cgi_ui/cgi-bin/strf_web.py to your desired web server address and database path. Some operating systems do not allow regular users to run services on port 80. In that case, you can use port 8080 instead.

Go into the StructureFinder main directory and run

python3 cgi_ui/cgi-bin/strf_web.py

 

Be aware that running a web server has security implications. Do not expose this server to the internet unless you know what you are doing!

The web site should look like this after clicking on a table row:

strf_web.png

 

Open Database Automatically

If you always want to open the same database file with the Windows version, you can add the database file as a command line parameter in the start menu shortcut.

Changelog

-v47 Fixed bug with non-centrosymmetric shelx files. More compatible win32 version.

-v46 Fixed bug in cif parser that prevented reading of atoms from files written by Crystals

-v45 Fixed installation problems on some systems.

-v44 Small improvements in the cif file parser.

-v43 Added search option for R1 in advanced search. 

-v42 Fixed some more minor search bugs.

-v41 Fixed completely wrong result for "more results" option. Added distinction between unit cells from APEX and unit cells from files for optimal search thresholds. Made search behavior more intuitive.

-v40 Improved program startup.

-v39 Fixed bug that could stall the application while loading QWebEngineView.

-v38 _data name was shown incorrectly.

-v37 Fixed some minor search issues. 

-v36 Fixed .res file data appearing in compressed cif entries. Fixed certain search condition.

-v35 Improved .cif file parser

-v34 Web interface improvements.

-v33 Added search option to search for structures with only certain elements included.

-v32 Improved element search. It is faster and more precise. (You need to rebuild your database!)

-v31 Fixed two bugs that caused StructureFinder to crash during cell search.

-v30 Added search in CellCheckCSD database. You need to install CellCheckCSD in order to search CSD data.

-v29 Fixed missing shelxfile directory in 32bit installer.

-v28 Faster structure display. Some bugs fixed.

-v27 Support "grow structure" for web and qt interface.

 

License

StructureFinder is free software and is licensed under the beerware license.