Senior Full Stack Engineer & Cryptocurrency Enthusiast
My first experience with crypto wasn't under ideal circumstances: a friend who manages several servers at his job was infected with ransomware, and the malware demanded he pay the ransom in a cryptocurrency called Monero (XMR).
After this not-so-friendly introduction, I began to study how the technology behind cryptocurrencies works, and I fell in love with it. I was already into the stock market, so I joined the familiar (the stock market) with the novel (crypto). To test the knowledge I'd gained from my stock market books, I started by building a simple Moving Average Convergence Divergence (MACD) crossover bot.
This worked for a while, but I quickly realized that I could, and should, make the bot a lot better.
Now, the project that I started as a hobby has a capital management system, a combination of technical indicators, and sentiment analysis powered by machine learning. Between 10 March 2020 and 10 August 2020, my bot achieved a success rate of 63.1%, a profit factor of 1.74, and cumulative gross results of roughly 588% (you can see a copy of all of my trades during this period in this Google Sheet report).
About my project
I needed a bot that gave me a high-performance, scalable way to calculate technical indicators and process sentiment data in real time.
To do everything I need in terms of technical indicator calculations, I collect candlestick chart data and market depth through an always-on websocket connection that tracks every Bitcoin market on the Binance exchange (~215 in total, 182 of them tradeable at this moment).
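For reference, here is a minimal sketch of how that candlestick stream could be laid out in TimescaleDB. The column names match the queries later in this post, but the exact types and constraints are my assumptions, not a copy of the real schema:

-- Minimal sketch of the candlestick storage (types and constraints assumed).
CREATE TABLE candlestick (
    time   TIMESTAMP WITHOUT TIME ZONE NOT NULL,
    symbol TEXT NOT NULL,          -- e.g., 'BNBBTC'
    open   NUMERIC(16,8) NOT NULL,
    high   NUMERIC(16,8) NOT NULL,
    low    NUMERIC(16,8) NOT NULL,
    close  NUMERIC(16,8) NOT NULL,
    volume NUMERIC(24,8) NOT NULL
);

-- Turn the plain table into a TimescaleDB hypertable, partitioned by time.
SELECT create_hypertable('candlestick', 'time');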
The machine learning sentiment analysis started as a simple experiment to see if external news affected the market. For example: if a famous person in the crypto ecosystem tweeted that a big exchange was hacked, the price would probably fall and affect the whole market. Likewise, great news should influence the price in a positive way. I calculated sentiment analysis scores in real time, as soon as new data was ingested from sources like Twitter, Reddit, RSS feeds, etc. Then, using these scores, I could determine market conditions at any given moment.
Now, I combine these two components with a weighted average: 60% technical indicators and 40% sentiment analysis.
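The combination itself is simple arithmetic. As a sketch, the two score tables and their columns below are hypothetical stand-ins; only the 60/40 weighting is the real rule:

-- Hypothetical sketch: blend the latest per-symbol scores 60/40.
SELECT t.symbol,
       0.6 * t.score + 0.4 * s.score AS combined_score
FROM   technical_score t                 -- hypothetical indicator-score table
JOIN   sentiment_score s USING (symbol)  -- hypothetical sentiment-score table
WHERE  t.time = (SELECT MAX(time) FROM technical_score)
  AND  s.time = (SELECT MAX(time) FROM sentiment_score);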
Current trading bot dashboard, where I monitor all my ongoing trades and results; in this particular case, filtered for the period of 10 March 2020 to 10 August 2020.
Quick breakdown of my results and success rates week-over-week; in this particular case, filtered for the period of 10 March 2020 to 10 August 2020.
At first, I tried to save the collected data in simple files, but quickly realized that wasn't a good way to store and process this data. I started looking for an alternative: a performant database.
I went through several databases, and each of them always lacked something I wound up needing to continue my project. I tried MongoDB, InfluxDB, and Druid, but none of them 100% met my needs.
Of the databases I tried, InfluxDB was a good option; however, every query that I tried to run was painful, due to their own query language (InfluxQL). As soon as my series started to grow exponentially, the server didn't have enough memory to handle them all in real time. This is because the current InfluxDB TSM storage engine requires more and more allocated memory for each series. I have a lot of unique metrics, so the process ran out of available memory quickly.
I handle fairly large amounts of data every day, especially on days with a lot of market movement. On average, I'm ingesting around 20k records/market, or 3.6 million total records, per day (20k * 182 markets), and even with this huge volume of data, my query response times are in the milliseconds.
This is where TimescaleDB started to shine for me. It gave me fast real-time aggregations, built-in time-series capabilities, and high ingestion rates, and it didn't require ever-increasing memory usage to do all of this.
In addition to this raw market data, a common use case for me is to analyze the data in different time frames (e.g., 1 min, 5 min, 1 hr, etc.). I keep these records in a pre-computed aggregate to increase my query performance and allow me to make faster decisions about whether or not to enter a position.
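TimescaleDB's continuous aggregates are a natural fit for that kind of pre-computation. Here is a minimal sketch of a 5-minute roll-up over the candlestick table; the view name is mine, and using first()/last() inside a continuous aggregate assumes TimescaleDB 2.7 or later:

-- Sketch: pre-computed 5-minute candles rolled up from the raw candlestick rows.
CREATE MATERIALIZED VIEW candlestick_5m
WITH (timescaledb.continuous) AS
SELECT time_bucket('5 minutes', time) AS bucket,
       symbol,
       first(open, time) AS open,   -- first price in the bucket
       MAX(high)         AS high,
       MIN(low)          AS low,
       last(close, time) AS close,  -- last price in the bucket
       SUM(volume)       AS volume
FROM candlestick
GROUP BY bucket, symbol;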
For example, here's a simple query that I use a lot to monitor the performance of my trades on a daily or weekly basis (daily in this case):
SELECT time_group, total_trades, positive_trades, negative_trades,
       ROUND(100 * (positive_trades / total_trades), 2) AS success_rate,
       profit AS gross_profit,
       ROUND((profit - (total_trades * 0.15)), 2) AS net_profit
FROM (
    SELECT time_bucket('1 day', buy_at::TIMESTAMP)::DATE AS time_group,
           COUNT(*) AS total_trades,
           SUM(CASE WHEN profit > 0 THEN 1 ELSE 0 END)::NUMERIC AS positive_trades,
           SUM(CASE WHEN profit <= 0 THEN 1 ELSE 0 END)::NUMERIC AS negative_trades,
           ROUND(SUM(profit), 2) AS profit
    FROM trade
    GROUP BY time_group
    ORDER BY time_group
) t
ORDER BY time_group;
Another example: these two functions compute the True Range (TR) and Average True Range (ATR) indicators directly in the database:

CREATE OR REPLACE FUNCTION tr(_symbol TEXT, _until INTERVAL)
RETURNS TABLE(date TIMESTAMP WITHOUT TIME ZONE, result NUMERIC(9,8), "%" NUMERIC(9,8))
LANGUAGE plpgsql AS $$
DECLARE
BEGIN
RETURN QUERY
WITH candlestick AS (
    SELECT * FROM candlestick c
    WHERE c.symbol = _symbol AND c.time > NOW() - _until
)
SELECT d.time,
       (GREATEST(a, b, c))::NUMERIC(9,8) AS result,
       (GREATEST(a, b, c) / d.close)::NUMERIC(9,8) AS "%"
FROM (
    SELECT today.time, today.close,
           today.high - today.low AS a,
           COALESCE(ABS(today.high - yesterday.close), 0) AS b,
           COALESCE(ABS(today.low - yesterday.close), 0) AS c
    FROM candlestick today
    LEFT JOIN LATERAL (
        SELECT yesterday.close
        FROM candlestick yesterday
        WHERE yesterday.time < today.time
        ORDER BY yesterday.time DESC
        LIMIT 1
    ) yesterday ON TRUE
    WHERE today.time > NOW() - _until
) d;
END;
$$;
CREATE OR REPLACE FUNCTION atr(_interval INT, _symbol TEXT, _until INTERVAL)
RETURNS TABLE(date TIMESTAMP WITHOUT TIME ZONE, result NUMERIC(9,8), "%" NUMERIC(9,8))
LANGUAGE plpgsql AS $$
DECLARE
BEGIN
RETURN QUERY
WITH true_range AS (
    SELECT * FROM tr(_symbol, _until)
)
SELECT tr.date, avg.sma AS result, avg.sma_percent AS "%"
FROM true_range tr
INNER JOIN LATERAL (
    SELECT AVG(lat.result) AS sma, AVG(lat."%") AS sma_percent
    FROM (
        SELECT * FROM true_range inr
        WHERE inr.date <= tr.date
        ORDER BY inr.date DESC
        LIMIT _interval
    ) lat
) avg ON TRUE
WHERE tr.date > NOW() - _until
ORDER BY tr.date;
END;
$$;
SELECT * FROM atr(14, 'BNBBTC', '4 HOURS') ORDER BY date;
My current deployment & future plans
To develop my bot and all its capabilities, I used Node.js as my main programming language, along with various libraries: cote to communicate between all my modules without overengineering, TensorFlow to train and deploy all my machine learning models, and tulind for technical indicator calculations, as well as various others.
Currently, I have a total of 55 markets (which are re-evaluated every month, based on trade simulation performance) that trade simultaneously 24/7; when all my strategy conditions are met, a trade is automatically opened. The bot respects my capital management system, which is basically to limit myself to 10 open positions and to only use 10% of the available capital at a given time (a sketch of that check appears below). To keep track of the results of an open trade, I use a dynamic Trailing Stop Loss and Trailing Take Profit.
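As an illustration of the gate just described, the check could be as simple as the query below. The open_position and account tables are hypothetical stand-ins; the 10-position and 10%-of-capital limits are the real rules:

-- Hypothetical sketch of the capital-management gate: allow a new trade only
-- if fewer than 10 positions are open, and size it at 10% of available capital.
SELECT (SELECT COUNT(*) FROM open_position) < 10 AS may_open_position,
       0.10 * free_capital                       AS position_size
FROM account;  -- hypothetical single-row table holding free_capital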
The process of re-evaluating a market requires a second instance of my bot that runs in the background and uses my main strategy to simulate trades in all Bitcoin markets. When it detects that a market is doing well, based on the metrics I monitor, that market enters the main bot instance and starts live trading. The same applies to markets that perform poorly; as soon as the main instance of my bot detects that things are going badly, the market is removed from the main instance and the second instance starts monitoring it. If it improves, it is added back in.
As every developer likely knows all too well, the process of building software is to always improve it. Right now, I'm trying to improve my capital management system using the Kelly Criterion.
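The Kelly fraction can be estimated straight from the trade history already stored in the trade table. A sketch, assuming the profit column holds per-trade returns, using the classic formula f* = W - (1 - W) / R, where W is the win rate and R is the ratio of the average win to the average loss:

-- Sketch: estimate the Kelly fraction f* = W - (1 - W) / R from past trades.
WITH stats AS (
    SELECT AVG(CASE WHEN profit > 0 THEN 1.0 ELSE 0.0 END) AS win_rate,  -- W
           AVG(CASE WHEN profit > 0 THEN profit END)       AS avg_win,
           ABS(AVG(CASE WHEN profit <= 0 THEN profit END)) AS avg_loss
    FROM trade
)
SELECT win_rate - (1 - win_rate) / NULLIF(avg_win / NULLIF(avg_loss, 0), 0)
       AS kelly_fraction
FROM stats;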
For my use case, I've found TimescaleDB to be a powerful and robust choice: it's fast, with reliable ingest rates; it efficiently stores and compresses a huge dataset in a way that's manageable and cost-effective; and it gives me real-time aggregation functionality.
The TimescaleDB website, core documentation, and this blog post about managing and processing big time-series datasets are all quite easy to understand and follow, and the TimescaleDB team is responsive and helpful (they always show up in community discussions, like my Reddit AMA).
It's been easy and straightforward to scale, without adding any new technologies to the stack. And, as a SQL user, TimescaleDB adds very little maintenance overhead, especially compared to learning or maintaining a new database or language.
This post is slightly modified from the original post on the TimescaleDB blog (read it here).