Clairvoyant API

class clairvoyant.Backtest(variables, trainStart, trainEnd, testStart, testEnd, buyThreshold=0.65, sellThreshold=0.65, C=1, gamma=10, continuedTraining=False, tz=<UTC>)[source]

Backtest is a type of machine learning classifier.

The purpose of Backtest is to collect statistics on the performance of learned classifications while providing a quick and easy way to vary parameters for rapid experimentation. Backtest also provides some convenience functions for visualizing collected statistics.

Parameters:
  • variables – A list of columns that represent learning features.
  • trainStart – A datetime as a string that should be consistent with the tz parameter. Defines the start date for model training.
  • trainEnd – A datetime as a string that should be consistent with the tz parameter. Defines the end date for model training.
  • testStart – A datetime as a string that should be consistent with the tz parameter. Defines the start date for model testing.
  • testEnd – A datetime as a string that should be consistent with the tz parameter. Defines the end date for model testing.
  • buyThreshold – Defines the confidence level at which Clair will recommend a buy. Default 0.65.
  • sellThreshold – Defines the confidence level at which Clair will recommend a sell. Default 0.65.
  • C – The SVM penalty parameter for misclassified training observations. See the scikit-learn documentation for more details. Default 1.
  • gamma – The coefficient of the RBF kernel. See the scikit-learn documentation for more details. Default 10.
  • continuedTraining – Determine if data from the testing period should be used to continue training the model during the testing phase. Default False.
  • tz – The timezone associated with the datetime parameters. Default UTC.
Variables:
  • debug – A boolean value that determines if debug strings will be printed as backtesting is run. Warning: may result in a lot of output.
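
As a rough illustration of how the two thresholds could gate recommendations, the sketch below maps a predicted direction and its confidence to an action. The function name recommend, the "hold" outcome, and the inclusive comparison (>=) are assumptions made for illustration; they are not taken from Clairvoyant's source.

```python
def recommend(prediction, confidence, buy_threshold=0.65, sell_threshold=0.65):
    """Return 'buy', 'sell', or 'hold' for one period (illustrative only).

    prediction: 1 for a predicted up move, -1 for a predicted down move.
    confidence: the classifier's probability for that prediction (0..1).
    """
    if prediction == 1 and confidence >= buy_threshold:
        return "buy"
    if prediction == -1 and confidence >= sell_threshold:
        return "sell"
    # Anything below its threshold produces no recommendation.
    return "hold"


print(recommend(1, 0.72))   # confident up move
print(recommend(-1, 0.60))  # down move, below the sell threshold
```

Raising either threshold trades fewer recommendations for higher confidence in each one, which is the kind of parameter variation Backtest is meant to make cheap.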

buyLogic(*args, **kwargs)[source]

Increment the buy count.

buyStats()[source]

Return the collected buy statistics.

clearStats()[source]

Reset all collected statistics.

displayConditions()[source]

Print the learning and testing parameters.

displayStats()[source]

Print the collected backtesting statistics.

nextPeriodLogic(prediction, performance, *args, **kwargs)[source]

Collect statistics on correct and incorrect buys and sells.

Parameters:
  • prediction – A 1 or -1 representing an up or down performance.
  • performance – A positive or negative value representing the actual observed performance.
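
The bookkeeping described here can be sketched in a few lines. The helper tally and the counter key names are hypothetical, and treating a flat period (performance of exactly 0) as incorrect is an assumption of this sketch rather than documented behavior.

```python
from collections import Counter


def tally(stats, prediction, performance):
    """Record whether one recommendation matched the observed move.

    prediction: 1 (predicted up) or -1 (predicted down).
    performance: observed change over the next period; its sign gives direction.
    """
    side = "buy" if prediction == 1 else "sell"
    # A buy is correct when the price rose; a sell is correct when it fell.
    correct = performance > 0 if prediction == 1 else performance < 0
    outcome = "correct" if correct else "incorrect"
    stats[f"{side}_{outcome}"] += 1
    return stats


stats = Counter()
for prediction, performance in [(1, 0.012), (1, -0.004), (-1, -0.020)]:
    tally(stats, prediction, performance)

print(dict(stats))
```
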
runModel(data)[source]

Run backtesting.

Parameters:
  • data – A History of stock data that includes observations in both the training and test phases.
sellLogic(*args, **kwargs)[source]

Increment the sell count.

sellStats()[source]

Return the collected sell statistics.

visualizeModel(width=5, height=5, stepsize=0.02)[source]

Output a visualization of the backtesting results.

The diagram overlays training and testing observations on top of a color coded representation of learned recommendations. The color intensity represents the distribution of probability.

class clairvoyant.Clair(variables, trainStart, trainEnd, testStart, testEnd, buyThreshold=0.65, sellThreshold=0.65, C=1, gamma=10, continuedTraining=False, tz=<UTC>)[source]

Cla.I.R. - Classifier Inferred Recommendations.

Clair uses the support vector machine supplied by the scikit-learn library to infer buy and sell classifications for stocks using a client-supplied feature specification. Clair uses the default Radial Basis Function kernel provided by SVC. For more details, see the scikit-learn documentation.

Clients need to provide a date range for the training phase and another range for the testing phase. The learning phase determines classification probabilities that are used in the testing phase.
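
A minimal sketch of preparing the four date strings follows. The string format is assumed to match the date string shown in the History examples, and the check that testing starts after training ends is a sanity check suggested here, not a rule stated by the library.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M:%S"  # assumed format for the datetime strings


def parse_utc(text):
    """Parse a datetime string and mark it as UTC (the default tz)."""
    return datetime.strptime(text, FMT).replace(tzinfo=timezone.utc)


train_start = parse_utc("2017-01-02 06:30:00")
train_end = parse_utc("2017-01-31 13:00:00")
test_start = parse_utc("2017-02-01 06:30:00")
test_end = parse_utc("2017-02-14 13:00:00")

# Each range runs forward, and the testing range begins after training ends.
assert train_start < train_end < test_start < test_end
print(test_start.isoformat())
```
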

Once the model is reliably trained, clients may use the predict() function to predict a result given an observed support vector.

Parameters:
  • variables – A list of columns that represent learning features.
  • trainStart – A datetime as a string that should be consistent with the tz parameter. Defines the start date for model training.
  • trainEnd – A datetime as a string that should be consistent with the tz parameter. Defines the end date for model training.
  • testStart – A datetime as a string that should be consistent with the tz parameter. Defines the start date for model testing.
  • testEnd – A datetime as a string that should be consistent with the tz parameter. Defines the end date for model testing.
  • buyThreshold – Defines the confidence level at which Clair will recommend a buy. Default 0.65.
  • sellThreshold – Defines the confidence level at which Clair will recommend a sell. Default 0.65.
  • C – The SVM penalty parameter for misclassified training observations. See the scikit-learn documentation for more details. Default 1.
  • gamma – The coefficient of the RBF kernel. See the scikit-learn documentation for more details. Default 10.
  • continuedTraining – Determine if data from the testing period should be used to continue training the model during the testing phase. Default False.
  • tz – The timezone associated with the datetime parameters. Default UTC.
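
To give a feel for what gamma controls, the sketch below evaluates the standard RBF kernel formula, exp(-gamma * ||x - y||^2), by hand. The helper rbf_kernel and the sample feature values are illustrative only; scikit-learn computes this internally.

```python
import math


def rbf_kernel(x, y, gamma=10):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


a, b = [0.10, 0.50], [0.20, 0.40]
print(rbf_kernel(a, a))            # identical points are maximally similar
print(rbf_kernel(a, b, gamma=10))  # similarity falls off with distance
print(rbf_kernel(a, b, gamma=1))   # smaller gamma: slower fall-off
```

A larger gamma makes each training observation influence only its close neighbors, so the learned decision regions become tighter and more prone to overfitting.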
buyLogic(prob, row, attrs)[source]

Override this function to provide your own logic.

execute(data, model, X=[], y=[])[source]

Execute the strategy logic using a trained model and input data.

Parameters:
  • data – A History object containing testing data.
  • model – A trained model.
  • X – Optional preprocessed support vectors used for continued training.
  • y – Optional preprocessed target values corresponding to any supplied support vectors.
learn(data, X=[], y=[])[source]

Start the learning phase.

Parameters:
  • data – A History object containing stock data along with training features.
  • X – Optional preprocessed support vectors.
  • y – Optional preprocessed target values. Should coincide with the X parameter.
nextPeriodLogic(prediction, performance, row, attrs)[source]

Override this function to provide your own logic.

predict(model, Xs)[source]

Calculate the probability of a buy or sell classification.

Parameters:
  • model – A trained model.
  • Xs – A list containing support vector data for a single vector.
sellLogic(prob, row, attrs)[source]

Override this function to provide your own logic.
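
The override pattern might look like the sketch below. StandInClair is a hypothetical stand-in that copies only the hook signatures listed above so the example runs without the library; in real use the subclass would derive from clairvoyant.Clair. The shape of row (a named tuple) and attrs (a map from common names to tuple keys) is assumed to match the description given for Portfolio.portfolioValue, and the "Close" key is a guess.

```python
from collections import namedtuple


class StandInClair:
    """Hypothetical stand-in for clairvoyant.Clair; hook signatures only."""

    def buyLogic(self, prob, row, attrs):
        pass

    def sellLogic(self, prob, row, attrs):
        pass

    def nextPeriodLogic(self, prediction, performance, row, attrs):
        pass


class SignalLog(StandInClair):
    """Records each recommendation instead of trading on it."""

    def __init__(self):
        self.signals = []

    def buyLogic(self, prob, row, attrs):
        self.signals.append(("buy", prob, getattr(row, attrs["Close"])))

    def sellLogic(self, prob, row, attrs):
        self.signals.append(("sell", prob, getattr(row, attrs["Close"])))


Row = namedtuple("Row", ["Date", "Close"])
attrs = {"Close": "Close"}  # assumed key map: common name -> tuple field

strategy = SignalLog()
strategy.buyLogic(0.71, Row("2017-02-14 06:30:00", 133.5), attrs)
strategy.sellLogic(0.68, Row("2017-02-15 06:30:00", 131.0), attrs)
print(strategy.signals)
```
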

class clairvoyant.Portfolio(variables, trainStart, trainEnd, testStart, testEnd, buyThreshold=0.65, sellThreshold=0.65, C=1, gamma=10, continuedTraining=False, tz=<UTC>, transaction_cost=9.99)[source]

Provides a basic portfolio framework for backtesting.

Parameters:
  • variables – A list of columns that represent learning features.
  • trainStart – A datetime as a string that should be consistent with the tz parameter. Defines the start date for model training.
  • trainEnd – A datetime as a string that should be consistent with the tz parameter. Defines the end date for model training.
  • testStart – A datetime as a string that should be consistent with the tz parameter. Defines the start date for model testing.
  • testEnd – A datetime as a string that should be consistent with the tz parameter. Defines the end date for model testing.
  • buyThreshold – Defines the confidence level at which Clair will recommend a buy. Default 0.65.
  • sellThreshold – Defines the confidence level at which Clair will recommend a sell. Default 0.65.
  • C – The SVM penalty parameter for misclassified training observations. See the scikit-learn documentation for more details. Default 1.
  • gamma – The coefficient of the RBF kernel. See the scikit-learn documentation for more details. Default 10.
  • continuedTraining – Determine if data from the testing period should be used to continue training the model during the testing phase. Default False.
  • tz – The timezone associated with the datetime parameters. Default UTC.
  • transaction_cost – The amount deducted from balance after each trade.
Variables:
  • debug – A boolean value that determines if debug strings will be printed as backtesting is run. Warning: may result in a lot of output.
  • startingBalance – The initial balance.
  • buyingPower – Cash balance.
  • shares – The number of shares of a stock.
  • lastQuote – The latest available stock price.
buyLogic(confidence, row, attrs)[source]

Decide whether or not to buy shares.

buyShares(shares, quote)[source]

Buy a certain number of shares.

Parameters:
  • shares – The number of shares to buy.
  • quote – The price to buy shares at.
clearAllRuns()[source]

Reset the portfolio.

displayAllRuns()[source]

Print trading statistics.

displayLastRun()[source]

Print results of the latest run.

nextPeriodLogic(prediction, nextPeriodPerformance, row, attrs)[source]

Record performance.

portfolioValue(row, attrs)[source]

Determine the value of the portfolio.

Parameters:
  • row – Stock data as a named tuple.
  • attrs – A key map that maps common names to the named tuple keys.
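
One plausible reading of this calculation is cash plus the market value of the shares held. The sketch below assumes the closing price is the quote used and that attrs contains a "Close" key; neither detail is confirmed by this reference, and portfolio_value is a hypothetical free function rather than the method itself.

```python
from collections import namedtuple


def portfolio_value(buying_power, shares, row, attrs):
    """Cash plus the value of held shares at the row's closing price."""
    quote = getattr(row, attrs["Close"])
    return buying_power + shares * quote


Row = namedtuple("Row", ["Date", "Close"])
attrs = {"Close": "Close"}  # assumed key map: common name -> tuple field
row = Row("2017-02-14 06:30:00", 50.0)

print(portfolio_value(1000.0, 20, row, attrs))
```
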
runModel(data, startingBalance)[source]

Backtest the portfolio strategy.

Parameters:
  • data – Historical stock data.
  • startingBalance – The beginning available cash balance.
sellLogic(confidence, row, attrs)[source]

Decide whether or not to sell shares.

sellShares(shares, quote)[source]

Sell a certain number of shares.

Parameters:
  • shares – The number of shares to sell.
  • quote – The price to sell at.
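
The effect of transaction_cost on a round trip can be sketched with a toy class. ToyPortfolio is hypothetical bookkeeping written for this example; whether Clairvoyant's Portfolio rejects unaffordable trades or charges the fee in exactly this way is an assumption.

```python
class ToyPortfolio:
    """Hypothetical bookkeeping sketch; not Clairvoyant's Portfolio class."""

    def __init__(self, starting_balance, transaction_cost=9.99):
        self.buying_power = starting_balance
        self.shares = 0
        self.transaction_cost = transaction_cost

    def buy_shares(self, shares, quote):
        cost = shares * quote + self.transaction_cost
        if cost > self.buying_power:
            raise ValueError("not enough buying power")
        self.buying_power -= cost
        self.shares += shares

    def sell_shares(self, shares, quote):
        if shares > self.shares:
            raise ValueError("not enough shares")
        self.buying_power += shares * quote - self.transaction_cost
        self.shares -= shares


p = ToyPortfolio(1000.0)
p.buy_shares(10, 50.0)   # spends 500.00 plus the 9.99 fee
p.sell_shares(10, 55.0)  # receives 550.00 minus the 9.99 fee
print(round(p.buying_power, 2), p.shares)
```

The 50.00 gross gain shrinks to 30.02 after two fees, which is why a strategy that trades often needs a larger edge per trade.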
class clairvoyant.History(data, col_map=None, tz=<UTC>, features=None)[source]

A wrapper for historical stock data.

You can query for a row by date:

history['2017-02-14 06:30:00']  # get data by a specific date

You can slice using datetime objects or index numbers:

history[startDate:endDate]  # get data between startDate and endDate
history[0:100]              # get rows between 0 and 100

You can get individual records by index:

history[10]  # gets a row of data

You can access a column of data by key just like a dataframe:

history['Open']  # gets a column of data
history.open     # or access the same data by attribute
Parameters:
  • data – Client stock data. Can be a string representing a csv file or it can be a pandas dataframe.
  • col_map – A dict mapping your data’s column names to common names where the common names are keys and your custom names are values. This is an optional parameter. If None is provided, History will assume client data is already formatted with common names.
  • tz – The timezone to associate with the datetime in data. Default is UTC time.
  • features – A list of column names that informs Clair which columns can be used as learning features.
Variables:
  • date – Datetime series in data corresponding to the beginning of each period.
  • open – Opening stock price series.
  • high – Series of stock price highs.
  • low – Series of stock price lows.
  • close – Closing stock price series.
  • volume – Series of trading volumes.
  • return_rate – Series of percentage change calculated as a percent of opening price.
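
The direction of col_map (common names as keys, custom names as values) and the return_rate calculation can be illustrated with plain dictionaries. The sample data, the helper names, and the exact formula (close minus open, divided by open) are assumptions for this sketch; History itself works on a pandas dataframe.

```python
# Client data with custom column names (made-up values).
rows = [
    {"o": 100.0, "c": 102.0},
    {"o": 102.0, "c": 99.96},
]

# Common names are keys; the client's own column names are values.
col_map = {"Open": "o", "Close": "c"}


def to_common(row, col_map):
    """Rename a row's custom keys to their common names."""
    return {common: row[custom] for common, custom in col_map.items()}


def return_rate(row):
    """Change over the period as a fraction of the opening price (assumed)."""
    return (row["Close"] - row["Open"]) / row["Open"]


for raw in rows:
    row = to_common(raw, col_map)
    print(row, round(return_rate(row), 4))
```
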
read_csv(*args, **kwargs)[source]

Read a csv file.

Exact same interface as pandas.read_csv.

rename(*args, **kwargs)[source]

Rename the stored dataframe columns.

Exposes the exact same interface as pandas.DataFrame.rename.