Clairvoyant API
class clairvoyant.Backtest(variables, trainStart, trainEnd, testStart, testEnd, buyThreshold=0.65, sellThreshold=0.65, C=1, gamma=10, continuedTraining=False, tz=<UTC>)
Backtest is a type of machine learning classifier.
The purpose of Backtest is to collect statistics on the performance of learned classifications while providing a quick and easy way to vary parameters for rapid experimentation. Backtest also provides some convenience functions for visualizing collected statistics.
Parameters:
- variables – A list of columns that represent learning features.
- trainStart – A datetime string that should be consistent with the tz parameter. Defines the start date for model training.
- trainEnd – A datetime string that should be consistent with the tz parameter. Defines the end date for model training.
- testStart – A datetime string that should be consistent with the tz parameter. Defines the start date for model testing.
- testEnd – A datetime string that should be consistent with the tz parameter. Defines the end date for model testing.
- buyThreshold – The confidence level at which Clair will recommend a buy. Default 0.65.
- sellThreshold – The confidence level at which Clair will recommend a sell. Default 0.65.
- C – The penalty parameter of the SVM error term; larger values penalize misclassified training samples more heavily. See scikit-learn documentation for more details. Default 1.
- gamma – The coefficient of the RBF kernel. See scikit-learn documentation for more details. Default 10.
- continuedTraining – Whether data from the testing period should be used to continue training the model during the testing phase. Default False.
- tz – The timezone associated with the datetime parameters. Default UTC.
Variables:
- debug – A boolean value that determines whether debug strings are printed as backtesting runs. Warning: may result in a lot of output.
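To illustrate what a datetime string "consistent with the tz parameter" might look like, the sketch below parses a string and attaches UTC using only the standard library. The helper name `parse_in_tz` and the string format are assumptions made for illustration; Clairvoyant's own parsing may accept other formats.

```python
from datetime import datetime, timezone

def parse_in_tz(text, tz=timezone.utc, fmt="%Y-%m-%d %H:%M:%S"):
    """Hypothetical helper: parse a naive datetime string and attach tz."""
    naive = datetime.strptime(text, fmt)
    # Attaching tzinfo directly is fine for fixed-offset zones such as UTC.
    return naive.replace(tzinfo=tz)

# Example boundaries for a training window, both expressed in UTC.
train_start = parse_in_tz("2017-02-14 06:30:00")
train_end = parse_in_tz("2017-03-14 13:00:00")
```

Keeping every boundary in the same timezone avoids comparing naive and aware datetimes, which Python rejects with a TypeError.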
nextPeriodLogic(prediction, performance, *args, **kwargs)
Collect statistics on correct and incorrect buys and sells.
Parameters:
- prediction – A 1 or -1 representing an up or down performance.
- performance – A positive or negative value representing the actual observed performance.
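The bookkeeping described above can be sketched as follows. This is a minimal stand-in, not Backtest's actual implementation: the class `PredictionTally`, its method names, and the choice to count a performance of exactly zero as incorrect are all assumptions.

```python
class PredictionTally:
    """Hypothetical stand-in that tallies correct and incorrect calls."""

    def __init__(self):
        self.counts = {"correct_buys": 0, "incorrect_buys": 0,
                       "correct_sells": 0, "incorrect_sells": 0}

    def next_period_logic(self, prediction, performance):
        # prediction: 1 (up) or -1 (down); performance: observed signed change.
        if prediction == 1:
            key = "correct_buys" if performance > 0 else "incorrect_buys"
        elif prediction == -1:
            key = "correct_sells" if performance < 0 else "incorrect_sells"
        else:
            return  # no recommendation, nothing to record
        self.counts[key] += 1

    def accuracy(self):
        total = sum(self.counts.values())
        correct = self.counts["correct_buys"] + self.counts["correct_sells"]
        return correct / total if total else 0.0

tally = PredictionTally()
for prediction, performance in [(1, 0.8), (1, -0.2), (-1, -1.1), (-1, 0.4)]:
    tally.next_period_logic(prediction, performance)
```

Here two of the four calls match the observed direction, so `tally.accuracy()` evaluates to 0.5.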
class clairvoyant.Clair(variables, trainStart, trainEnd, testStart, testEnd, buyThreshold=0.65, sellThreshold=0.65, C=1, gamma=10, continuedTraining=False, tz=<UTC>)
Cla.I.R. - Classifier Inferred Recommendations.
Clair uses the support vector machine supplied by the scikit-learn library to infer buy and sell classifications for stocks using a client-supplied feature specification. Clair uses the default Radial Basis Function kernel provided by SVC. For more details, see the scikit-learn documentation.
Clients need to provide a date range for the training phase and another range for the testing phase. The learning phase determines classification probabilities that are used in the testing phase.
Once the model is reliably trained, clients may use the predict() function to predict a result given an observed support vector.
Parameters:
- variables – A list of columns that represent learning features.
- trainStart – A datetime string that should be consistent with the tz parameter. Defines the start date for model training.
- trainEnd – A datetime string that should be consistent with the tz parameter. Defines the end date for model training.
- testStart – A datetime string that should be consistent with the tz parameter. Defines the start date for model testing.
- testEnd – A datetime string that should be consistent with the tz parameter. Defines the end date for model testing.
- buyThreshold – The confidence level at which Clair will recommend a buy. Default 0.65.
- sellThreshold – The confidence level at which Clair will recommend a sell. Default 0.65.
- C – The penalty parameter of the SVM error term; larger values penalize misclassified training samples more heavily. See scikit-learn documentation for more details. Default 1.
- gamma – The coefficient of the RBF kernel. See scikit-learn documentation for more details. Default 10.
- continuedTraining – Whether data from the testing period should be used to continue training the model during the testing phase. Default False.
- tz – The timezone associated with the datetime parameters. Default UTC.
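One plausible reading of buyThreshold and sellThreshold is that a recommendation is issued only when the classifier's probability for the corresponding class reaches the threshold. The sketch below shows that reading; the function `recommend`, the inclusive comparison, and the 0 ("hold") return value are assumptions, not Clair's documented behavior.

```python
def recommend(prob_up, prob_down, buy_threshold=0.65, sell_threshold=0.65):
    """Hypothetical helper: map class probabilities to 1 (buy), -1 (sell), 0 (hold)."""
    if prob_up >= buy_threshold:
        return 1
    if prob_down >= sell_threshold:
        return -1
    return 0  # neither class is confident enough

signals = [recommend(0.70, 0.30), recommend(0.20, 0.80), recommend(0.55, 0.45)]
```

Raising either threshold trades fewer recommendations for higher confidence in each one, which is the kind of parameter variation Backtest is meant to make quick to explore.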
execute(data, model, X=[], y=[])
Execute the strategy logic using a trained model and input data.
Parameters:
- data – A History object containing testing data.
- model – A trained model.
- X – Optional preprocessed support vectors used for continued training.
- y – Optional preprocessed target values corresponding to any supplied support vectors.
learn(data, X=[], y=[])
Start the learning phase.
Parameters:
- data – A History object containing stock data along with training features.
- X – Optional preprocessed support vectors.
- y – Optional preprocessed target values. Should coincide with the X parameter.
nextPeriodLogic(prediction, performance, row, attrs)
Override this function to provide your own logic.
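The override pattern can be sketched without the library itself. In the example below, `StrategyBase` is a hypothetical stand-in for a class exposing the nextPeriodLogic hook, and `run`, `LoggingStrategy`, `Row`, and the `"Close"` key are illustrative names; only the hook's signature is taken from the documentation above.

```python
from collections import namedtuple

class StrategyBase:
    """Hypothetical stand-in for a class that calls nextPeriodLogic per period."""

    def nextPeriodLogic(self, prediction, performance, row, attrs):
        pass  # default hook does nothing

    def run(self, periods, attrs):
        for prediction, performance, row in periods:
            self.nextPeriodLogic(prediction, performance, row, attrs)

class LoggingStrategy(StrategyBase):
    """Subclass that records an action and closing price for each period."""

    def __init__(self):
        self.log = []

    def nextPeriodLogic(self, prediction, performance, row, attrs):
        # attrs maps a common name to the field name used by the row tuple.
        close = getattr(row, attrs["Close"])
        action = {1: "buy", -1: "sell"}.get(prediction, "hold")
        self.log.append((action, close))

Row = namedtuple("Row", ["close_price"])
attrs = {"Close": "close_price"}
strategy = LoggingStrategy()
strategy.run([(1, 0.5, Row(10.0)), (-1, -0.3, Row(9.5)), (0, 0.1, Row(9.6))], attrs)
```

Looking fields up through `attrs` keeps the strategy independent of the column names used in a particular data set.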
class clairvoyant.Portfolio(variables, trainStart, trainEnd, testStart, testEnd, buyThreshold=0.65, sellThreshold=0.65, C=1, gamma=10, continuedTraining=False, tz=<UTC>, transaction_cost=9.99)
Provides a basic portfolio framework for backtesting.
Parameters:
- variables – A list of columns that represent learning features.
- trainStart – A datetime string that should be consistent with the tz parameter. Defines the start date for model training.
- trainEnd – A datetime string that should be consistent with the tz parameter. Defines the end date for model training.
- testStart – A datetime string that should be consistent with the tz parameter. Defines the start date for model testing.
- testEnd – A datetime string that should be consistent with the tz parameter. Defines the end date for model testing.
- buyThreshold – The confidence level at which Clair will recommend a buy. Default 0.65.
- sellThreshold – The confidence level at which Clair will recommend a sell. Default 0.65.
- C – The penalty parameter of the SVM error term; larger values penalize misclassified training samples more heavily. See scikit-learn documentation for more details. Default 1.
- gamma – The coefficient of the RBF kernel. See scikit-learn documentation for more details. Default 10.
- continuedTraining – Whether data from the testing period should be used to continue training the model during the testing phase. Default False.
- tz – The timezone associated with the datetime parameters. Default UTC.
- transaction_cost – The amount deducted from the balance after each trade. Default 9.99.
Variables:
- debug – A boolean value that determines whether debug strings are printed as backtesting runs. Warning: may result in a lot of output.
- startingBalance – The initial balance.
- buyingPower – Cash balance.
- shares – The number of shares of a stock.
- lastQuote – The latest available stock price.
Buy a certain number of shares.
Parameters:
- shares – The number of shares to buy.
- quote – The price to buy shares at.
portfolioValue(row, attrs)
Determine the value of the portfolio.
Parameters:
- row – Stock data as a named tuple.
- attrs – A key map that maps common names to the named tuple keys.
runModel(data, startingBalance)
Backtest the portfolio strategy.
Parameters:
- data – Historical stock data.
- startingBalance – The beginning available cash balance.
Sell a certain number of shares.
Parameters:
- shares – The number of shares to sell.
- quote – The price to sell at.
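The interplay of buyingPower, shares, lastQuote, and transaction_cost can be sketched as below. `SimplePortfolio` and its method names `buy`, `sell`, and `value` are hypothetical, and the rule that a trade is refused when cash or shares are insufficient is an assumption rather than documented Portfolio behavior.

```python
class SimplePortfolio:
    """Hypothetical stand-in, not Clairvoyant's Portfolio class."""

    def __init__(self, starting_balance, transaction_cost=9.99):
        self.startingBalance = starting_balance
        self.buyingPower = starting_balance  # cash available for trades
        self.shares = 0
        self.lastQuote = 0.0
        self.transaction_cost = transaction_cost

    def buy(self, shares, quote):
        cost = shares * quote + self.transaction_cost
        if shares <= 0 or cost > self.buyingPower:
            return False  # refuse trades that cannot be funded
        self.buyingPower -= cost
        self.shares += shares
        self.lastQuote = quote
        return True

    def sell(self, shares, quote):
        if shares <= 0 or shares > self.shares:
            return False  # refuse to sell more than is held
        self.buyingPower += shares * quote - self.transaction_cost
        self.shares -= shares
        self.lastQuote = quote
        return True

    def value(self):
        # Cash plus holdings marked at the latest known price.
        return self.buyingPower + self.shares * self.lastQuote

portfolio = SimplePortfolio(1000.0, transaction_cost=10.0)
portfolio.buy(10, 50.0)   # spends 500 plus the 10 fee
portfolio.sell(10, 60.0)  # receives 600 minus the 10 fee
```

In this run the gross gain of 100 is reduced by two fees of 10, so the portfolio ends at 1080 from a starting balance of 1000.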
class clairvoyant.History(data, col_map=None, tz=<UTC>, features=None)
A wrapper for historical stock data.
You can query for a row by date:
history['2017-02-14 06:30:00'] # get data by a specific date
You can slice using datetime objects or index numbers:
history[startDate:endDate] # get data between startDate and endDate
history[0:100] # get rows between 0 and 100
You can get individual records by index:
history[10] # gets a row of data
You can access a column of data by key just like a dataframe:
history['Open'] # gets a column of data
history.open # or access the same data by attribute
Parameters:
- data – Client stock data. Can be a string representing a csv file or a pandas dataframe.
- col_map – A dict mapping your data’s column names to common names, where the common names are keys and your custom names are values. This is an optional parameter. If None is provided, History will assume client data is already formatted with common names.
- tz – The timezone to associate with the datetime in data. Default is UTC time.
- features – A list of column names that informs Clair which columns can be used as learning features.
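The direction of col_map (common names as keys, custom names as values) is easy to get backwards, so a small sketch may help. The helper `to_common_names` and the specific common names `"Date"` and `"Close"` are assumptions for illustration; only `"Open"` appears in the examples above.

```python
# Keys are common names; values are the column names used in the client's data.
col_map = {"Date": "timestamp", "Open": "o", "Close": "c"}

def to_common_names(record, col_map=None):
    """Hypothetical helper: rename a record's keys from custom to common names."""
    if col_map is None:
        return dict(record)  # assume the data already uses common names
    custom_to_common = {custom: common for common, custom in col_map.items()}
    # Columns without a mapping, such as feature columns, keep their names.
    return {custom_to_common.get(key, key): value for key, value in record.items()}

raw = {"timestamp": "2017-02-14 06:30:00", "o": 10.0, "c": 10.5, "sentiment": 0.3}
renamed = to_common_names(raw, col_map)
```

The unmapped `sentiment` column passes through unchanged, which is how a feature column named in `features` would remain addressable.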
Variables:
- date – Datetime series in data corresponding to the beginning of each period.
- open – Opening stock price series.
- high – Series of stock price highs.
- low – Series of stock price lows.
- close – Closing stock price series.
- volume – Series of trading volumes.
- return_rate – Series of price changes, each expressed as a percentage of the opening price.
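As a rough illustration of return_rate, the sketch below computes the change from open to close relative to the open. The exact formula is an assumption: the documentation does not say whether the library uses the close of the same period or scales the result by 100, and `return_rate` here is a hypothetical free function.

```python
def return_rate(open_price, close_price):
    """Hypothetical helper: percent change over a period relative to its open."""
    return (close_price - open_price) / open_price * 100.0

opens = [10.0, 20.0, 50.0]
closes = [11.0, 19.0, 50.0]
# Roughly a 10% gain, a 5% loss, and no change.
rates = [return_rate(o, c) for o, c in zip(opens, closes)]
```

The sign of each value is what a prediction of 1 or -1 would be checked against during backtesting.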