Use a database to programmatically synonymise words and phrases.
Developed from work associated with http://txt.1bpm.net/Cicerone/
Requirements:
- Python (developed with 2.7)
- PostgreSQL
- Peewee for Python
The source is provided without a setup.py or any other installer. The idea is
that it can be used as it is or copied into other Python projects. To set it
up:
- Load the gzipped sql file in the sql directory to your PostgreSQL instance.
- In the synonymiser directory, rename config.dist.py to config.py and edit the
values it contains to match your PostgreSQL installation.
synonymiser.py can be called directly and has a number of command line options.
It can handle a single word or a phrase given as an argument, or take input
piped to stdin. Some of the following options only make sense for synonymising
a single word:
- -h, --help : show help
- -o, --offensive : include words marked as offensive in the database (they
are excluded by default)
- -l LIMIT, --limit LIMIT : limit the number of synonyms returned to LIMIT.
Only relevant when providing a single word.
- -s SORTING, --sorting SORTING : sort the list of synonyms. Available options
are random, alpha and none. Only relevant when providing a single word.
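The options above can be mirrored with argparse, which ships with Python 2.7.
The following is only a sketch of how such a parser might be declared, not the
code in synonymiser.py: the positional name "text", the default values and the
help strings are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser with the same flags as the list above (illustrative)."""
    parser = argparse.ArgumentParser(
        description="Synonymise a word or phrase.")
    # Hypothetical positional: zero or more words. An empty list would
    # signal that input should be read from stdin instead.
    parser.add_argument("text", nargs="*",
                        help="word or phrase to synonymise")
    parser.add_argument("-o", "--offensive", action="store_true",
                        help="include words marked as offensive")
    # Assumed default: no limit unless one is given.
    parser.add_argument("-l", "--limit", type=int, default=None,
                        help="maximum number of synonyms to return")
    # Assumed default: random ordering.
    parser.add_argument("-s", "--sorting",
                        choices=["random", "alpha", "none"],
                        default="random",
                        help="ordering of the returned synonyms")
    return parser
```

With this sketch, build_parser().parse_args(["-l", "3", "-s", "alpha", "happy"])
yields limit 3, sorting "alpha" and text ["happy"].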
The synonymiser directory can be used as a subpackage in your Python project,
or you can copy just the three files config.py, synonymiser.py and db.py into
it. The functions intended to be exposed are:
synonymise(line, offensives=False) : synonymise a line of text, replacing
each word with a randomly selected synonym. Words marked as offensive are
selected only if offensives is True. Returns a string.
get_synonyms(word, limit=1, sorting=SORTING.RANDOM, offensives=False) : get
synonyms for word, up to the number specified by limit. sorting can be
SORTING.RANDOM, SORTING.ALPHA or SORTING.NONE. Words marked as offensive are
selected only if offensives is True. Returns a list.
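As a rough illustration of calling both functions from another project, the
sketch below wraps them in a helper. The helper name synonym_report is
hypothetical. The import path assumes the synonymiser directory sits on the
Python path as a package and that SORTING can be imported from synonymiser.py
alongside the two functions; adjust it to match your layout. A configured
database is needed for the calls to return anything.

```python
def synonym_report(line, word, limit=5):
    """Return (rewritten_line, synonyms) using the two documented functions."""
    # Imported inside the function so that this file still loads where the
    # package is not installed. The import path is an assumption.
    from synonymiser.synonymiser import SORTING, get_synonyms, synonymise

    # Rewrite the whole line, leaving out words marked as offensive.
    rewritten = synonymise(line, offensives=False)

    # Fetch up to `limit` synonyms for one word in alphabetical order.
    synonyms = get_synonyms(word, limit=limit, sorting=SORTING.ALPHA,
                            offensives=False)
    return rewritten, synonyms
```

Calling synonym_report("the quick brown fox", "quick") would then return the
rewritten string together with a list of at most five synonyms for "quick".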