bilbo.utils package

Submodules

bilbo.utils.crf_datas module

crf data

bilbo.utils.crf_datas.apply_patterns(sections_xyseq, patterns, empty_features=False)

Transform a list of features given patterns

Parameters:sections_xyseq – iterable : a generator over sections, each giving a list of features and its labels
Returns:a generator that yields a new list of features given the patterns
bilbo.utils.crf_datas.extract_y(sections, nfeatures=None)
Parameters:
  • sections – iterable : a sections generator (like returned by fd2sections() )
  • nfeatures – None|int : if None, the first line of the first section is expected to carry its label as the last feature. Otherwise nfeatures indicates the number of features, and sections[x][nfeatures] is the line’s label.
Returns:

a generator that yields one tuple(xseq, yseq) per section
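
As an illustration of the (xseq, yseq) split described above, here is a minimal sketch. It is not the bilbo implementation: split_labels is a hypothetical stand-in, and it assumes each section is a list of lines and each line is a list of feature strings.

```python
def split_labels(sections, nfeatures=None):
    """Hypothetical stand-in for extract_y(): yield (xseq, yseq) per section."""
    for section in sections:
        xseq, yseq = [], []
        for line in section:
            if nfeatures is None:
                # Assumption: the first line seen carries its label as the
                # last item, so every item before it is a feature.
                nfeatures = len(line) - 1
            xseq.append(list(line[:nfeatures]))
            yseq.append(line[nfeatures])
        yield xseq, yseq


sections = [
    [["a", "b", "B"], ["c", "d", "I"]],
    [["e", "f", "O"]],
]
pairs = list(split_labels(sections))
```

Passing nfeatures explicitly is useful when lines carry extra columns after the label.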

bilbo.utils.crf_datas.fd2patterns(patterns_fd)

Read a Wapiti pattern file

Parameters:patterns_fd – iterable : a line generator
Returns:An array of tuple(name, row, col)
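
The following sketch shows one way such (name, row, col) tuples could be read from pattern lines. It is an assumption-laden illustration, not the bilbo parser: read_patterns and PATTERN_RE are hypothetical names, only a single %x[row,col] reference per line is handled, and the name is taken to be the text before %x with trailing ":" or "=" removed.

```python
import re

# Hypothetical regex: a name prefix followed by one %x[row,col] reference.
PATTERN_RE = re.compile(
    r"^(?P<name>[^%\s]*)%[xX]\[\s*(?P<row>[-+]?\d+)\s*,\s*(?P<col>\d+)\s*\]"
)


def read_patterns(lines):
    """Hypothetical stand-in for fd2patterns(): list of (name, row, col)."""
    patterns = []
    for line in lines:
        # Drop comments (assumed to start with '#') and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        match = PATTERN_RE.match(line)
        if match:
            name = match.group("name").rstrip(":=")
            patterns.append((name, int(match.group("row")), int(match.group("col"))))
    return patterns


example = ["# unigrams\n", "U00:%x[-1,0]\n", "\n", "U01:%x[0,1]\n"]
parsed = read_patterns(example)
```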
bilbo.utils.crf_datas.fd2sections(datas_fd, sep=None)

Generator that yields sections of features from BIOS-formatted content coming from a line generator

Parameters:
  • datas_fd – iterable: a line generator (as returned by open())
  • sep – None|str : if None, yield a single string containing the BIOS-formatted features. Otherwise split lines and features using sep
Returns:

Depends on bios
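
To make the sectioning behaviour concrete, here is a minimal sketch under stated assumptions. iter_sections is a hypothetical stand-in, sections are assumed to be separated by blank lines, and the sep=None case follows one reading of the description above (each section returned as a single string).

```python
def iter_sections(lines, sep=None):
    """Hypothetical stand-in for fd2sections(): group lines into sections."""
    current = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # Assumption: a blank line closes the current section.
            if current:
                yield "\n".join(current) if sep is None else current
                current = []
            continue
        current.append(line if sep is None else line.split(sep))
    if current:
        yield "\n".join(current) if sep is None else current


lines = ["a b B\n", "c d I\n", "\n", "e f O\n"]
as_strings = list(iter_sections(lines))
as_lists = list(iter_sections(lines, sep=" "))
```

Because the function only iterates over its input, it accepts an open file object as well as a list of strings.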

bilbo.utils.crf_datas.sections2evaluate(sections, prop=0.8, seed=None)

Split sections into a training part and an evaluation part

Parameters:
  • sections – iterable: items are sections
  • prop – float : split proportion
  • seed – int | None: random seed
Returns:

the sections split for train / test purposes
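
A seeded split of this kind can be sketched as follows. This is not the bilbo code: split_sections is a hypothetical name, the shuffle-then-cut strategy is an assumption, and prop is assumed to be the share of sections kept for training.

```python
import random


def split_sections(sections, prop=0.8, seed=None):
    """Hypothetical stand-in for sections2evaluate(): (train, evaluation)."""
    items = list(sections)
    # A dedicated Random instance keeps the split reproducible for a given
    # seed without touching the global random state.
    random.Random(seed).shuffle(items)
    cut = int(round(len(items) * prop))
    return items[:cut], items[cut:]


train, evaluation = split_sections(range(10), prop=0.8, seed=42)
```

Reusing the same seed yields the same partition, which makes evaluation runs comparable.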

bilbo.utils.crf_datas.trainer_opts(name, options)

Return a dict of options for the trainer

Parameters:
  • name – str : can be wapiti | crfsuite
  • options – str (dict) : the options for crfsuite
Returns:

a dict

bilbo.utils.dictionaries module

dictionaries

bilbo.utils.dictionaries.compile_multiword(infile)
Parameters:infile – str
bilbo.utils.dictionaries.generatePickle(dic, infile)

Generate the pickle file

Parameters:
  • dic – dictionary
  • infile – str
Returns:

pickle file

bilbo.utils.svm_datas module

bilbo.utils.svm_datas.fd2features(datas_fd, to_dict=False)

Process SVM data file

Parameters:to_dict – bool : if true, yielded values are dicts, else strings
Returns:a generator
bilbo.utils.svm_datas.fd2labeled_evaluation(datas_fd, to_dict=False, prop=0.8, seed=None)
Return two iterators, one on the training data and one on the evaluation data (same generator as fd2labeled_features)
Parameters:to_dict – bool : if true, return a dict, else a string
Returns:tuple(train_datas, validation_datas)
bilbo.utils.svm_datas.fd2labeled_features(datas_fd, to_dict=False)
Generator comparable to fd2features but that yields a tuple (label, features)
Parameters:to_dict – bool: if true, the features are returned as a dict, else a string is yielded
Returns:a generator that yields tuples
bilbo.utils.svm_datas.svmRepport(y_test, y_pred)

Print the evaluation report given the test and prediction data

Parameters:
  • y_test – list of test labels (oracle)
  • y_pred – list of predicted labels (same range as test)
bilbo.utils.svm_datas.svm_opts()

Return kwargs and args for model training given argparse parsed arguments

Parameters:args – Namespace: as returned by ArgumentParser.parse_args()
Returns:a tuple(args, kwargs)

bilbo.utils.timer module

Timer class

class bilbo.utils.timer.Timer(name='', autostart=True)

Bases: object

Simple timer class

last
mean()
Returns:the average of recorded timers
name
reset(name=None)

Reset the timer and store the elapsed time

Parameters:name – str: new timer name. If given, stored data are erased
start()

Starts the timer

t()
Returns:elapsed seconds since last start() call
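
The interface listed above can be illustrated with a small stand-in class. SketchTimer is hypothetical and only mirrors the documented member names. Its internals are assumptions: elapsed times are kept in a list, reset() clears that list when a new name is given, and mean() returns 0.0 when nothing has been recorded.

```python
import time


class SketchTimer:
    """Hypothetical stand-in mirroring the documented Timer interface."""

    def __init__(self, name="", autostart=True):
        self.name = name
        self.last = None      # elapsed time stored by the latest reset()
        self._records = []    # assumption: stored elapsed times
        self._start = None
        if autostart:
            self.start()

    def start(self):
        self._start = time.perf_counter()

    def t(self):
        # Elapsed seconds since the last start() call.
        return time.perf_counter() - self._start

    def reset(self, name=None):
        self.last = self.t()
        if name is not None:
            # Assumption: renaming erases previously stored data.
            self.name = name
            self._records = []
        else:
            self._records.append(self.last)
        self.start()

    def mean(self):
        if not self._records:
            return 0.0
        return sum(self._records) / len(self._records)


timer = SketchTimer("parsing")
sum(range(1000))   # some work to time
timer.reset()      # store the elapsed time and restart
average = timer.mean()
```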

Module contents

utils init