Originally published in Stocks & Commodities, February 2006 Issue.
The System Behind The System
by Merlin Jeffries, February 2006
When I was a novice system developer, I would get excited by every marble-smooth equity curve I stumbled upon. Incredible backtest result in hand, I would start trading my systems with real money, only to find they were big losers in real time. What happened?
What happened is that the systems I was finding, while excellent at predicting the past, were deficient when it came to predicting the future. This variety of trading system was almost impossible for me to distinguish from the future-predicting kind until, that is, I started religiously following a process for developing my systems.
The Means Justify the End
In the world of system development the means justify the end. One cannot judge a trading system based upon a backtest result (the end); one must instead consider how the backtest result was achieved (the means). The means, or the system behind the system, is a strategically designed process for developing trading systems, and it’s a trader’s main defense against being fooled into putting real money behind a hindsight-based system.
The System Behind the System
Here I will present my system, or process, for developing mechanical trading systems. Hopefully it will give the reader a few ideas for developing their own.
First comes the often undervalued administrative work. For each system, I create a “project” to organize my efforts. The project gets a name and identification tag. This comes in handy when I want to look back on my past research.
For each project, I move through the same seven steps:
Step 1 – Conceptualize
The first step is to write down the concept and initial rules for the system. For example, if I am researching a simple breakout system, I may write:
concept:
A break of the 20-day high or low signifies the start of a trend.
rules:
ENTRY LONG: buy when price breaks the highest high of 20 days
ENTRY SHORT: sell when price breaks the lowest low of 20 days
EXIT 1: stoploss
EXIT 2: profit target
EXIT 3: max hold period
Next I will identify my system’s degrees of freedom (DOFs). This will come in handy later. In this example, there are three DOFs to declare:
DOFs:
1. stoploss size
2. profit target size
3. max hold period days
I could also consider the “20 day” breakout point to be a DOF. However, in this example I will assume that 20 days is “hard coded” and cannot be changed during the project. If there is any aspect of the system that I may want to optimize or change over the life of the project, I declare it as a DOF.
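As an illustration only, the concept, rules, and DOFs above could be written down in code roughly as follows. This is a minimal Python sketch, not the author's implementation: the names `DOFs`, `BREAKOUT_DAYS`, and `entry_signal` are hypothetical, and it assumes the latest bar is compared with the 20 bars that precede it and that a bar breaking both extremes is treated as a long entry.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

BREAKOUT_DAYS = 20  # "hard coded" for the project, so not declared as a DOF


@dataclass(frozen=True)
class DOFs:
    """The three declared degrees of freedom (field names are hypothetical)."""
    stoploss_ticks: int
    profit_target_ticks: int
    max_hold_days: int


def entry_signal(highs: Sequence[float], lows: Sequence[float],
                 lookback: int = BREAKOUT_DAYS) -> Optional[str]:
    """Return "long", "short", or None for the most recent bar.

    The latest bar is compared with the highest high and lowest low of the
    `lookback` bars that come before it.
    """
    if len(highs) != len(lows):
        raise ValueError("highs and lows must have the same length")
    if len(highs) < lookback + 1:
        return None  # not enough history yet
    prior_high = max(highs[-lookback - 1:-1])
    prior_low = min(lows[-lookback - 1:-1])
    if highs[-1] > prior_high:
        return "long"   # assumption: long wins if both extremes break
    if lows[-1] < prior_low:
        return "short"
    return None
```

Keeping the DOFs in one small, immutable object makes it easy to record exactly which values were used in each later backtest.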
Step 2 – Gather Data
Here I gather the market data that will be used to code and test the system. Following the example, I may decide that my total data pool will include the following:
EURUSD / 05.1998 – 08.2005
USDJPY / 07.1999 – 08.2005
EURGBP / 05.1998 – 08.2005
AUDUSD / 05.2000 – 08.2005
I like to divide my total data pool into four categories:
1. Build Data
This is the data I will use to program/code the system. I don’t need much data for this, just enough to ensure that my code is executing trades as designed.
2. Test Data
I use this data to find my system’s most robust DOFs through chart study and optimization.
3. Walkforward Data
This data will be used to determine if my system can predict the future.
4. Unseen Data
This data offers one final, “honest” test of the system’s ability to predict the future.
How a trader divides their data is more art than science. General guidelines I use are 5% Build Data, 40% Test Data, 40% Walkforward Data, and 15% Unseen Data. I also like to spread a specific symbol (e.g. EURUSD) among at least three data categories.
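One possible way to apply the 5/40/40/15 guideline is sketched below in Python. The article does not say how the cut is made, so this assumes the simplest reading: each symbol's bars are in chronological order and are sliced into four consecutive blocks. The names `SPLIT_PERCENT` and `split_data` are hypothetical.

```python
from typing import Dict, List, Sequence, TypeVar

T = TypeVar("T")

# Guideline proportions from the text, as whole percentages.
SPLIT_PERCENT = (("build", 5), ("test", 40), ("walkforward", 40), ("unseen", 15))


def split_data(bars: Sequence[T]) -> Dict[str, List[T]]:
    """Cut a chronologically ordered series into the four data categories."""
    n = len(bars)
    result: Dict[str, List[T]] = {}
    start = 0
    cumulative = 0
    for name, percent in SPLIT_PERCENT:
        cumulative += percent
        end = n * cumulative // 100  # integer math avoids rounding drift
        result[name] = list(bars[start:end])
        start = end
    return result
```

Slicing every symbol this way places it in all four categories, which satisfies the "at least three" preference; assigning different date ranges per symbol would be an equally valid choice.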
Step 3 – Code
(data used: Build Data)
Next I write the programming code for the system, and I use the Build Data to check that the code is executing trades as intended.
More times than I’d like to admit, a flaw in my early code has wasted an entire project. I’ve learned it’s better to spend an hour closely inspecting trades than to waste two days researching and testing the wrong system.
Step 4 – Refine
(data used: Test Data)
Using the Test Data, I optimize my DOFs until I find a set that shows desirable results. I don’t worry much about the system becoming over-optimized, because if that happens it will not survive the next three steps.
Some software applications, like TradeStation, allow traders to run automated optimizations for the DOFs. I like to use this feature lightly to get a feel for the range of DOF values that will be profitable. If I try 50 values for each of my DOFs and 40 yield positive results, this is a good sign that the system is robust. Traders may instead want to plug in values for the DOFs manually until they stumble across a set that seems to work. I like to take this approach as well.
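The "40 out of 50" rule of thumb can be expressed as a simple ratio. The Python sketch below is only an illustration: `profitable_fraction` is a hypothetical helper, and `net_profit` stands in for whatever backtest the trader's own software runs on the Test Data for a single DOF value.

```python
from typing import Callable, Iterable


def profitable_fraction(values: Iterable[int],
                        net_profit: Callable[[int], float]) -> float:
    """Fraction of the tried DOF values whose backtest net profit is positive."""
    results = [net_profit(v) for v in values]
    if not results:
        raise ValueError("no values were tried")
    return sum(1 for r in results if r > 0) / len(results)
```

With 50 tried values of which 40 are profitable, the function returns 0.8. A high fraction suggests the result does not hinge on one lucky setting.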
The main purpose of this step is to study the system on the Test Data and choose the most robust set of DOF values. Continuing the example, suppose I choose the following set of values:
DOFs:
1. stoploss size – 30 ticks
2. profit target size – 40 ticks
3. max hold days – 22 days
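To make the three exits concrete, here is a hedged Python sketch of how the chosen values might be checked on each bar of an open long position. Several things are assumptions rather than part of the article: the function name `long_exit_reason`, the `tick_size` parameter, checking the stoploss before the profit target when one bar touches both, and counting the hold period in whole days.

```python
from typing import Optional


def long_exit_reason(entry_price: float, bar_high: float, bar_low: float,
                     days_held: int, tick_size: float,
                     stoploss_ticks: int = 30,
                     profit_target_ticks: int = 40,
                     max_hold_days: int = 22) -> Optional[str]:
    """Return which exit rule fires on this bar for a long trade, or None."""
    stop_price = entry_price - stoploss_ticks * tick_size
    target_price = entry_price + profit_target_ticks * tick_size
    if bar_low <= stop_price:
        return "stoploss"          # assumption: stop is checked first
    if bar_high >= target_price:
        return "profit target"
    if days_held >= max_hold_days:
        return "max hold period"
    return None
```

A short position would mirror the price comparisons. Checking the stop first is the more conservative choice when intrabar order is unknown.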
Step 5 – Test
(data used: Test Data)
This step is simple. Using the DOFs from Step 4, I run a backtest on the Test Data, then save the backtest report for reference. I then carefully study the backtest report to determine if the results are good enough to be traded in real time:
> If I would not be willing to trade the system, I loop back to Step 4 to re-refine the system.
> If I am sure I would trade the system in real time, based on the backtest results, then it’s on to the next step.
Step 6 – Walkforward
(data used: Walkforward Data)
This is where I find out if the system has promise. Without changing the DOFs used in Step 5, I run a backtest on the Walkforward Data:
> If the backtest result is not profitable, or does not have tradable attributes, I loop back to Step 4 and come up with another set of DOFs (re-refine).
> If the backtest result looks tradable, I cross my fingers and move to the final step.
Step 7 – Unseen Data
(data used: Unseen Data)
The final step is to backtest the system on the Unseen Data. I like to call this step “the moment of truth” because it’s quick yet awfully significant.
> If the backtest result is negative, the project is over. All the data has been spoiled and any further efforts will be tainted with hindsight. When I reach this point I usually let out a sigh and mumble something about how the life of a quant isn’t all it’s cracked up to be. Unfortunately this is the way most of my projects end.
> If the backtest result is positive, a group of angels flies into my office and circles my monitor singing “hallelujah!” By passing this step, there is a good chance that the system has the ability to predict the future, and it will most likely end up in my portfolio.
The Importance of Unseen Data
“Unseen” data is nothing more than a second set of Walkforward Data. I added this aspect to my process after becoming frustrated with the notion of having only “one shot” at getting the DOFs on the Walkforward Data right before having to abandon the system.
The Unseen Data gives considerable flexibility to my process. I’m able to refine, test, and walk the system forward a number of times without contaminating all of my data. I am always left with one final “honest” test of the system.
It’s worth noting that many of my projects never reach Step 7. It’s perfectly normal to loop through Steps 4-6 without ever having a tradable result on the Walkforward Data. After five loops through Steps 4-6, the Walkforward Data can become extremely contaminated with hindsight, so much so that I can no longer trust the result obtained from the data. When I reach this point I will usually end the project, never officially making it to Step 7 (although I may still run the system on the Unseen Data for kicks).
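The control flow of Steps 4 through 7 can be summarized in a short Python sketch. It is a simplification under stated assumptions: `run_project` and its four callback parameters are hypothetical stand-ins for the trader's own refinement and backtest judgments, and every pass through Step 4 counts against the five-loop limit, which is stricter than the text requires.

```python
from typing import Callable, Optional, TypeVar

P = TypeVar("P")  # a chosen set of DOF values

MAX_LOOPS = 5  # after about five loops the Walkforward Data is hard to trust


def run_project(refine: Callable[[], P],
                tradable_on_test: Callable[[P], bool],
                tradable_on_walkforward: Callable[[P], bool],
                positive_on_unseen: Callable[[P], bool],
                max_loops: int = MAX_LOOPS) -> Optional[P]:
    """Return the DOF set to trade, or None if the project ends without one."""
    for _ in range(max_loops):
        params = refine()                      # Step 4: pick a DOF set
        if not tradable_on_test(params):       # Step 5: judge on Test Data
            continue                           # back to Step 4
        if tradable_on_walkforward(params):    # Step 6: same DOFs, new data
            # Step 7: the Unseen Data is consulted exactly once.
            return params if positive_on_unseen(params) else None
    return None  # Walkforward Data too contaminated; project abandoned
```

The important property is that the Unseen Data check sits outside the loop's retry path: a failure there ends the project instead of sending it back to Step 4.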
It’s a Hard-Knock Life
The hard-to-endure truth is that I may do 30 projects and never find a good result on the Unseen Data. But this only gives me confidence in my process. After all, finding a robust system is truly a rare event for even the smartest system developers; I would have to question any process that allowed me to trade every system I code.
The key to a great development process is that it keeps hindsight out of the final judgment call of whether or not to trade the system in real time. The better the trader’s process, the more failures the trader will endure. This is just one more tragic burden the trader must bear. The brighter side of having a good development process is that the trader knows that when they go to trade a system they have a good chance of making money, and that should trump everything.
About the Author:
Merlin Jeffries is a professional trader and a moderator at Forex Factory. He can be contacted at [email protected].