Core Intel
Krzysztof Adamski, Krzysztof Żmij
On the bank secret service
Are security breaches common?
https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/432412/bis-15-302-information_security_breaches_survey_2015-full-report.pdf
Carbanak
https://securelist.com/blog/research/68732/the-great-bank-robbery-the-carbanak-apt/
Core Intel is part of the ING Cyber Crime Resilience Programme, which aims to structurally improve the capabilities for cybercrime:
• prevention
• detection
• response
Core Intel
• Measures against e-banking fraud, DDoS and Advanced Persistent Threats (APTs)
• Threat intelligence allows us to respond to, or even prevent, a cybercrime attack (this kind of intelligence is available via internal and external parties and includes both open and closed communities)
• Monitoring, detection and response to “spear phishing”
• Detection and mitigation of infected ING systems
• Baselining network traffic / anomaly detection
• Response to incidents (knowledge, tools, IT environment)
• Automated feeds, automated analysis and historical data analysis
The reasoning
The world is not enough
So the challenge is…
Market leaders Benelux
Growth markets
Commercial Banking
Challengers
Most of our data is within Europe
but we operate globally
Expect the unexpected to collect all the data
• What kind of data do we need?
• Where is our data located?
• How can we potentially capture it?
• What are the legal implications?
So there is a challenge to capture “all” the data
Core Intel architecture
So what you would like to see is…
Photo credit: edgarpierce via Foter.com / CC BY
…In fact it is slightly more complicated
Everything has its own purpose. Let’s look at the details.
Photo credit: https://www.pexels.com/photo/dslr-camera-equipments-147462/
Local data collector
But how do we capture that data?
https://observer.viavisolutions.com/includes/popups/taps/tap-vs-span.php
Broker settings:
Replication factor >= 3
min.insync.replicas = 2
unclean.leader.election.enable = false
replica.lag.time.max.ms

Producer settings:
acks = all
retries = Integer.MAX_VALUE
max.block.ms = Long.MAX_VALUE
block.on.buffer.full = true

To keep data in order:
max.in.flight.requests.per.connection = 1

Kafka producer configuration (as we don’t like losing data)
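
The interplay between the broker and producer settings can be checked with simple arithmetic. With acks = all, a write is acknowledged only once all in-sync replicas have it, and the broker rejects writes when fewer than min.insync.replicas replicas are in sync. The sketch below is not from the original deck: the function names are hypothetical, and it only illustrates the reasoning without talking to a Kafka cluster.

```python
def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Number of replicas that can drop out while acks=all writes are still accepted."""
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min.insync.replicas must be between 1 and the replication factor")
    return replication_factor - min_insync_replicas


def write_accepted(in_sync_replicas: int, min_insync_replicas: int) -> bool:
    """With acks=all, the broker accepts a write only if enough replicas are in sync."""
    return in_sync_replicas >= min_insync_replicas


# Values from the slide: replication factor 3, min.insync.replicas = 2.
print(tolerated_broker_failures(3, 2))  # one replica may be down
print(write_accepted(2, 2))             # still writable with two replicas in sync
print(write_accepted(1, 2))             # writes are rejected rather than risked
```

Limiting max.in.flight.requests.per.connection to 1 complements this: with unlimited retries, a retried batch could otherwise land after a later batch and break ordering.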
Central data collector
Time is crucial here
Photo credit: Cargo Cult via Foter.com / CC BY
But your business data matters more, so proceed with caution
Photo credit: https://www.pexels.com/photo/white-caution-cone-on-keyboard-211151/
• Network bandwidth control
• quota.consumer.default
• quota.producer.default
Kafka mirror maker configuration
Secure data:
listeners=SSL://host.name:port
ssl.client.auth=required
ssl.keystore.location
ssl.keystore.password
ssl.key.password
ssl.truststore.location
ssl.truststore.password
Kafka mirror maker configuration
Secure data in transit
Streaming data
spark.yarn.maxAppAttempts
spark.yarn.am.attemptFailuresValidityInterval
spark.yarn.max.executor.failures
spark.yarn.executor.failuresValidityInterval
spark.task.maxFailures
spark.hadoop.fs.hdfs.impl.disable.cache
spark.streaming.backpressure.enabled=true
spark.streaming.kafka.maxRatePerPartition
Spark on YARN streaming configuration
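
To get a feel for what spark.streaming.kafka.maxRatePerPartition implies, the cap on a single batch is the per-partition rate times the number of partitions times the batch interval. The sketch below works through that arithmetic. The numbers and function names are hypothetical and not taken from the deck, and the drain estimate assumes the job actually sustains the configured cap.

```python
def max_records_per_batch(max_rate_per_partition: int, partitions: int,
                          batch_interval_s: float) -> int:
    """Upper bound on records a single streaming batch will read from Kafka."""
    return int(max_rate_per_partition * partitions * batch_interval_s)


def seconds_to_drain(backlog: int, max_rate_per_partition: int, partitions: int,
                     incoming_rate: float) -> float:
    """Rough time to work off a backlog while new records keep arriving."""
    capacity = max_rate_per_partition * partitions  # records per second at the cap
    if capacity <= incoming_rate:
        return float("inf")  # the cap is too low to ever catch up
    return backlog / (capacity - incoming_rate)


# Hypothetical values: 1000 records/s per partition, 12 partitions, 5 s batches.
print(max_records_per_batch(1000, 12, 5))          # 60000
print(seconds_to_drain(1_200_000, 1000, 12, 8000))  # 300.0
```

A cap that is too low turns a restart after an outage into a long catch-up period, so it is worth sizing it against the expected backlog.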
In-memory data grid
val rddFromMap = sc.fromHazelcastMap("map-name-to-be-loaded")
Let’s find something in these logs
Photo credit: https://www.flickr.com/photos/65363769@N08/12726065645/in/pool-555784@N20/
Matching
Tornado: a Python web framework and asynchronous networking library (http://www.tornadoweb.org/)
MessagePack: a binary transport format (http://msgpack.org/)
• Automatically and continually match network logs <-> threat intel
  • When new threat intel arrives, match it against the full history of network logs
  • When new network logs arrive, match them against the full history of threat intel
• Alerts are shown in a hit dashboard
• The dashboard is a web-based interface that provides flexible charts, querying, aggregation and browsing
• The quality/relevance of an alert depends on the quality of the IoC feeds and the completeness of internal log data
Hits, alerts and dashboards
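
The two-way matching described here can be sketched in a few lines. This is a minimal in-memory illustration, not the Core Intel implementation: the class, the field names (src, dst, ts) and the feed name are hypothetical, and it only does exact matching on a single value such as an IP address.

```python
from collections import defaultdict


class Matcher:
    """Keeps the history of both sides and matches each new arrival against the other."""

    def __init__(self):
        self.iocs = {}                          # indicator value -> feed name
        self.logs_by_value = defaultdict(list)  # observed value -> log entries

    def add_ioc(self, value, feed):
        """New threat intel: match it against the full history of network logs."""
        self.iocs[value] = feed
        return [(value, feed, entry) for entry in self.logs_by_value.get(value, [])]

    def add_log(self, entry):
        """New network log: match it against the full history of threat intel."""
        hits = []
        # dict.fromkeys removes duplicates (src == dst) while keeping order.
        for value in dict.fromkeys((entry["src"], entry["dst"])):
            self.logs_by_value[value].append(entry)
            if value in self.iocs:
                hits.append((value, self.iocs[value], entry))
        return hits


matcher = Matcher()
old_log = {"ts": "2016-01-01T10:00:00", "src": "10.0.0.5", "dst": "203.0.113.7"}
new_log = {"ts": "2016-01-02T09:30:00", "src": "10.0.0.9", "dst": "203.0.113.7"}

print(matcher.add_log(old_log))                  # no intel yet, so no hit
print(matcher.add_ioc("203.0.113.7", "feed-a"))  # retroactive hit on the stored log
print(matcher.add_log(new_log))                  # immediate hit on a new log
```

Each returned tuple is what would feed an alert on the dashboard; as the last bullet notes, a hit is only as good as the feed that produced the indicator.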
Be smart with your tooling
Photo credit https://www.flickr.com/photos/12749546@N07/
and leverage e.g. Elasticsearch templates
Data mapping:
- doc_values
- fielddata
- fields

Cluster settings to check:
gateway.recover_after_nodes
gateway.recover_after_master_nodes
gateway.recover_after_data_nodes
indices.recovery.max_bytes_per_sec
indices.breaker.total.limit
indices.breaker.fielddata.limit

Elasticsearch configuration
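
As an illustration of how the three mapping options fit together in an index template, the sketch below builds a template body as a Python dictionary and prints it as JSON. It assumes Elasticsearch 5.x-style syntax (a "template" index pattern, a mapping type, text and keyword fields); other versions spell these differently. The index pattern, type name, field names and shard counts are all hypothetical and not taken from the deck.

```python
import json


def build_template(pattern: str) -> dict:
    """Build an index template body (assumed 5.x-style) for network log indices."""
    return {
        "template": pattern,  # applied to every new index whose name matches
        "settings": {"number_of_shards": 3, "number_of_replicas": 1},
        "mappings": {
            "log": {
                "properties": {
                    "timestamp": {"type": "date"},
                    "src_ip": {"type": "ip"},
                    "url": {
                        "type": "text",
                        "fielddata": False,  # keep analyzed text out of the heap
                        "fields": {
                            # multi-field: exact value stored on disk via doc_values,
                            # usable for aggregations and sorting
                            "raw": {"type": "keyword", "doc_values": True}
                        },
                    },
                }
            }
        },
    }


template = build_template("netlogs-*")
print(json.dumps(template, indent=2))
```

Keeping aggregations on doc_values-backed fields rather than on fielddata is also what keeps the indices.breaker.fielddata.limit circuit breaker from tripping.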
For those who know how to use heavy equipment
Photo credit: News Collection & Public Distribution @techpearce2 via Foter.com / CC BY
Long-term data storage - HDFS
Kafka offset management
Core Intel allows users to perform advanced analytics on network logs using a set of powerful tools:
• Spark API to write code to process large data sets on a cluster
  • perform complex aggregations to collect interesting statistics
  • run large-scale clustering algorithms with Spark’s MLlib
  • run graph analyses on network logs using Spark’s GraphX
  • transform and extract data for use in another system (one better suited to specific analytics or visualization purposes)
• Kafka, so you can write your own Consumers and Producers to work with your data
  • to perform streaming analysis on your data
  • to implement your own alerting logic
• Toolset
  • Programming languages: Scala, Java, Python
  • IDEs: Eclipse / Scala IDE, IPython Notebook and RStudio
Advanced analytics
How do we schedule the jobs?
How to keep everything under control
Photo credit: https://www.flickr.com/photos/martijn141
Monitoring crucial points in your data pipeline
Something for smart guys
Photo credit: https://www.flickr.com/photos/jdhancock/5173498203/
Plenty of data to analyze
Upcoming challenges on the operations side
Shaken, not stirred?