Description

Last week a customer asked me what I think about the AWS Firewall and whether it would be wise for him to implement it in his environment.
I skimmed the various tech documents about it, got confused at first and realised that this is not one of those topics I was going to understand quickly and move on from.

I invested more time into it, started reading about rule groups, stateless and stateful rules and … I needed a break. There was surely no quick answer in sight.

Going into the topic even deeper I reached the proverbial French saying “je ne sais quoi”.

Would I implement this firewall in my own environment ?

Maybe…
Then again I’m a perfectionist, always afraid of failing…

I tend to find limitations, restrictions, imperfections and get easily disappointed.

If you want to learn I invite you to join me in this journey :)

What is AWS Firewall

If we look in the AWS docs we see:

AWS Network Firewall is a stateful, managed, network firewall and intrusion detection and prevention service for your virtual private cloud (VPC) that you created in Amazon Virtual Private Cloud (Amazon VPC). With Network Firewall, you can filter traffic at the perimeter of your VPC.

Sounds pretty compelling, no ?

When something is too good to be true… double check.

Basic concepts

Components:

1 x FW
- deploys a leg (like the AWS PrivateLink ~ Endpoint Service) in an AZ inside a VPC
- and another leg in another AZ for redundancy
1 x FW Policy that is attached 1:1 to a FW
- the Policy groups together the Rule Groups that you apply to the FW
- also the AWS Managed Lists
AWS Managed Lists
- malicious FQDN lists
- threat rule groups (emerging threats, botnet actions, malware, exploits)
Rule Groups
- Stateless (always first processed)
- Stateful
  - Standard FW rules
  - FQDN
  - IDS/IPS

Logic and Flow:

SUM[Rule Groups (Stateful RG + Stateless RG) ] + AWS Managed Lists --> 1 x FW Policy --> 1..n x FW

Worth mentioning that AWS FW uses Suricata for both FQDN and IDS/IPS. In IDS/IPS rules it supports the drop keyword.

When a packet arrives at the FW:

It first goes to the Stateless Rule Groups.
If there is a match and a clear action -> processing stops.
- except if the action == FWD TO STATEFUL
If there is NO match, but a default action -> continues from there (pass/drop/FWD TO STATEFUL).
It reaches the Stateful Engine
- again goes through the rules until a match
- goes as per Priority Order if FW Policy configured in STRICT MODE for it
- groups first PASS, then DROP, then REJECT, then ALERT if FW Policy configured in DEFAULT MODE
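The stateless part of this flow can be sketched in a few lines of Python. This is only a toy model of the logic described above, not how AWS implements it: the `StatelessRule` class, the match predicates and the `stateful` stand-in are all hypothetical names made up for the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

PASS, DROP, FORWARD = "pass", "drop", "forward_to_stateful"


@dataclass
class StatelessRule:
    priority: int                     # lower number is evaluated first
    matches: Callable[[Dict], bool]   # hypothetical match predicate
    action: str                       # PASS, DROP or FORWARD


def stateless_verdict(packet: Dict, rules: List[StatelessRule], default_action: str) -> str:
    """First matching stateless rule wins; otherwise the policy default applies."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(packet):
            return rule.action
    return default_action


def process(packet: Dict, rules: List[StatelessRule], default_action: str,
            stateful_engine: Callable[[Dict], str]) -> str:
    """Only FORWARD hands the packet over to the stateful engine."""
    verdict = stateless_verdict(packet, rules, default_action)
    if verdict == FORWARD:
        return stateful_engine(packet)
    return verdict


rules = [
    StatelessRule(10, lambda p: p["dst_port"] == 23, DROP),
    StatelessRule(20, lambda p: p["proto"] == "tcp", FORWARD),
]
# Stand-in for the stateful engine: it simply passes everything it sees.
stateful = lambda p: "stateful:pass"

print(process({"proto": "tcp", "dst_port": 23}, rules, DROP, stateful))
print(process({"proto": "tcp", "dst_port": 443}, rules, DROP, stateful))
print(process({"proto": "udp", "dst_port": 53}, rules, DROP, stateful))
```

The point of the model: a stateless Pass or Drop is final, and the stateful engine only ever sees what was explicitly forwarded to it, either by a rule or by the default action.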

Rule Groups

Stateless

A rule group can only contain EITHER stateful OR stateless rules.

You can find the basic configuration for a stateless rule group here:

What stands out is that you can:

predefine a capacity (~ TCAM) for the Rule Group; this is setting a max resource cap on it
Actions are:
- Pass = Accept the packet, stop processing
- Drop = Drop the packet
- Forward to stateful Rule Group (next in line for processing it)
  - here you have the more fancy stuff
  - IDS/IPS
  - FQDN

At Firewall Policy (groups together Rule Groups) level there’s a default action to set for packets that did NOT match a Stateless RG.
There it can also be:

Pass
Drop
Forward to stateful rule groups
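To make the default action more concrete, here is a rough sketch of the policy document as a Python dict. The key names and the `aws:` action strings follow the Network Firewall API as far as I know it; the ARNs are placeholders and `check_default_actions` is a hypothetical helper written just for this example.

```python
# Shape of a FirewallPolicy document; all ARNs are placeholders, not real resources.
ALLOWED_STATELESS_ACTIONS = {"aws:pass", "aws:drop", "aws:forward_to_sfe"}

firewall_policy = {
    "StatelessRuleGroupReferences": [
        {
            "ResourceArn": "arn:aws:network-firewall:eu-west-1:111122223333:"
                           "stateless-rulegroup/example-stateless",
            "Priority": 10,
        },
    ],
    # What happens to packets that match no stateless rule.
    "StatelessDefaultActions": ["aws:forward_to_sfe"],
    # Same question, but for fragmented packets.
    "StatelessFragmentDefaultActions": ["aws:drop"],
    "StatefulRuleGroupReferences": [
        {
            "ResourceArn": "arn:aws:network-firewall:eu-west-1:111122223333:"
                           "stateful-rulegroup/example-stateful",
        },
    ],
}


def check_default_actions(policy: dict) -> None:
    """Hypothetical sanity check: exactly one standard action per default list."""
    for key in ("StatelessDefaultActions", "StatelessFragmentDefaultActions"):
        standard = [a for a in policy[key] if a.startswith("aws:")]
        if len(standard) != 1 or standard[0] not in ALLOWED_STATELESS_ACTIONS:
            raise ValueError(f"{key} needs exactly one of {sorted(ALLOWED_STATELESS_ACTIONS)}")


check_default_actions(firewall_policy)
```

With boto3, a dict like this would go into the `FirewallPolicy` argument of the `network-firewall` client's `create_firewall_policy` call, next to a `FirewallPolicyName`.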

In Custom Actions you can configure CloudWatch Logging.
This only gives you statistics, though.
Full logging only works for Stateful rules.

Stateful

This is where the strength of the AWS Firewall lies.
When processing Rule Groups, the FW first sends a packet through the Stateless RGs
and then depending on the action
(default of FW Policy if no match | stateless RG if matched) = FWD to Stateful RG
the processing will continue here.

There are 3 types of Stateful RG:

Standard (allow/deny)
FQDN (implemented via Suricata)
IDS/IPS (also via Suricata)

Stateful Rule order

This is probably the most confusing topic, or at least it was for me when I struggled to read through the AWS docs.

Reminder: 1 x FW policy = SUM(Rule Groups)

What it does:

Strict Mode
- Rule Groups and Rules inside are processed as per Priority defined
Default Mode
- the FW puts everything together and then
- First process the Pass Rules
- Then Drop
- Then Reject
- Then Alert (Logging)

It seems you can OVERRIDE the Default Mode set at FW Policy level inside a Rule Group. The rules inside that RG will then be processed as per priority.

Only applies to Stateful Rules.
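The difference between the two modes is easier to see on a toy example. The sketch below only models the ordering described above; the `StatefulRule` class and its fields are hypothetical, and ties inside the same action simply keep their input order, which is a simplification.

```python
from dataclasses import dataclass
from typing import List

# Default Mode ordering: pass first, then drop, then reject, then alert.
ACTION_ORDER = {"pass": 0, "drop": 1, "reject": 2, "alert": 3}


@dataclass(frozen=True)
class StatefulRule:
    group_priority: int   # priority of the rule group inside the FW Policy
    position: int         # position of the rule inside its group
    action: str
    name: str


def evaluation_order(rules: List[StatefulRule], strict: bool) -> List[str]:
    """Return rule names in the order the engine would evaluate them."""
    if strict:
        ordered = sorted(rules, key=lambda r: (r.group_priority, r.position))
    else:
        # sorted() is stable, so rules with the same action keep their input order
        ordered = sorted(rules, key=lambda r: ACTION_ORDER[r.action])
    return [r.name for r in ordered]


rules = [
    StatefulRule(1, 1, "alert", "log-everything"),
    StatefulRule(1, 2, "drop", "block-telnet"),
    StatefulRule(2, 1, "pass", "allow-https"),
]

print(evaluation_order(rules, strict=True))
print(evaluation_order(rules, strict=False))
```

In Strict Mode the alert rule is looked at first because its group has the lowest priority number; in Default Mode the pass rule jumps to the front regardless of where it was defined.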

Stateful RG level:

FW policy level:

Standard

You define a rule group:

set a capacity (~TCAM) for it
add your rules (TCP/UDP/ICMP)
and an action:
- Pass
- Drop
- Alert (logging)

Rule definition

How it looks afterward:

Variables

Variables are quite cool to use in Rule Groups as they make your life easier when grouping together IPs/VMs for filtering purposes.

Here you have 2 choices:

Rule Variables: they behave like Linux environment variables.
You define a VAR_NAME and associate it with a list of CIDRs.

IPSET references: here there are 2 subtypes.

Managed Prefix Lists (Virtual Private Cloud -> Managed Prefix Lists menu to define them)
Network Firewall Resource Groups (Network Firewall -> Network Firewall Resource Groups menu)

The latter I find pretty cool.
It allows you to create a Dynamic Prefix List based on either EC2/VM attributes (TAGs) or EC2 ENIs.
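To show the idea behind Rule Variables, here is a small sketch that expands a variable inside a rule string the way an environment variable would be expanded. The `IPSets` / `Definition` layout mirrors how I understand the API to represent rule variables; the `expand` function is purely illustrative, since the firewall does this substitution itself.

```python
import re

# A variable name mapped to a list of CIDRs (example values).
rule_variables = {
    "IPSets": {
        "HOME_NET": {"Definition": ["10.0.0.0/16", "10.1.0.0/16"]},
    }
}


def expand(rule: str, variables: dict) -> str:
    """Replace $NAME with its CIDR list; unknown variables are left untouched."""
    def substitute(match: re.Match) -> str:
        ip_set = variables["IPSets"].get(match.group(1))
        if ip_set is None:
            return match.group(0)
        return "[" + ",".join(ip_set["Definition"]) + "]"

    return re.sub(r"\$([A-Za-z_][A-Za-z0-9_]*)", substitute, rule)


print(expand("pass tcp $HOME_NET any -> $EXTERNAL_NET 443 (sid:1;)", rule_variables))
```

Change the CIDR list once and every rule that references `$HOME_NET` follows, which is the whole appeal.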

FQDN

This one is pretty much self-explanatory:

list of domains to match
allow/deny
whether only for specific sources or for ALL traffic
- can define CIDRs
- no support for variables/ip sets like above
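A rough model of how such a domain list behaves, assuming (as I read the AWS docs) that an entry with a leading dot such as `.example.com` covers the domain itself plus all of its subdomains, while a plain entry matches only that exact name. The function names and the `ALLOWLIST` / `DENYLIST` handling are my own simplification.

```python
from typing import List


def domain_matches(hostname: str, target: str) -> bool:
    """Exact match, or wildcard match when the target starts with a dot."""
    hostname = hostname.lower().rstrip(".")
    target = target.lower()
    if target.startswith("."):
        return hostname == target[1:] or hostname.endswith(target)
    return hostname == target


def verdict(hostname: str, targets: List[str], list_type: str) -> str:
    """ALLOWLIST passes only listed domains; DENYLIST drops only listed domains."""
    hit = any(domain_matches(hostname, t) for t in targets)
    if list_type == "ALLOWLIST":
        return "pass" if hit else "drop"
    return "drop" if hit else "pass"


targets = [".example.com", "api.example.org"]
print(verdict("www.example.com", targets, "ALLOWLIST"))
print(verdict("www.example.org", targets, "ALLOWLIST"))
print(verdict("api.example.org", targets, "DENYLIST"))
```

Note that with an allow list everything not on the list gets dropped, so a forgotten domain shows up as a broken application rather than as a log entry.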

IDS/IPS

Here you define your rules in Suricata format with an action:

pass
drop
alert

You can find examples of rules that are publicly available here:
Proofpoint Emerging Threat Rules

FAQ

How do I get traffic to arrive to my FW ?

You need to add a route in your subnet pointing a destination (E-W or Egress 0/0) toward the FW VPC Endpoint.
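As a sketch, the route could be prepared like this and then handed to boto3. The helper name and all IDs are made up; `RouteTableId`, `DestinationCidrBlock` and `VpcEndpointId` are the `create_route` parameters I would expect to use for a firewall endpoint target.

```python
import ipaddress


def firewall_route(route_table_id: str, destination_cidr: str, firewall_endpoint_id: str) -> dict:
    """Build the arguments for an EC2 create_route call targeting the FW endpoint."""
    # Fails early with ValueError on a malformed CIDR such as "10.0.0.0/33".
    network = ipaddress.ip_network(destination_cidr)
    return {
        "RouteTableId": route_table_id,
        "DestinationCidrBlock": str(network),
        "VpcEndpointId": firewall_endpoint_id,
    }


# Placeholder IDs: egress default route of a protected subnet via the firewall.
kwargs = firewall_route("rtb-0123456789abcdef0", "0.0.0.0/0", "vpce-0123456789abcdef0")
print(kwargs)

# With credentials in place this would be applied with:
#   boto3.client("ec2").create_route(**kwargs)
```

You need one such route per subnet route table and per destination, which is where the management overhead mentioned later comes from.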

Are the Firewall Resources unlimited ?

No. There’s a Capacity concept that is reminiscent of TCAM on switches.
If you add an AWS managed Rule Group to your FW Policy you will see a Capacity field and how much it is being consumed.
If you add a custom Rule Group you will also have the possibility to set a limit on how much Capacity it can consume for sum[Rules(stateless|stateful)]
The Total is around 30k per Firewall.
1 x FW rule != 1 x Capacity
It depends on how the AWS algorithm manages to group rules together when rendering them on the FW (this means you can have several rules eating just 1 x capacity unit, for example).
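For stateless rules, my understanding of the AWS docs is that the capacity of a rule is estimated as the product of the number of values in each match setting, where an empty or "any" setting counts as 1. The sketch below is based on that assumption only; the setting names are made up and the real accounting may differ.

```python
from math import prod
from typing import Dict, List


def stateless_rule_capacity(match_settings: Dict[str, List[str]]) -> int:
    """Product of the number of values per match setting (empty counts as 1)."""
    return prod(max(1, len(values)) for values in match_settings.values())


rule = {
    "protocols": ["tcp", "udp"],
    "sources": ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"],
    "destinations": [],            # nothing specified = any
    "destination_ports": ["443"],
}

# 2 protocols x 3 sources x 1 x 1
print(stateless_rule_capacity(rule))
```

So a single rule with a few long lists can eat far more than one unit, which is worth checking before you fix the capacity of a Rule Group.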

How it works - Life of a packet

Use case:
Workloads that want to reach the internet:

Workload sends the packet to the VPC Router
There you program a route 0/0 via the AWS FW VPC Endpoint
The Endpoint sends the traffic to the FW
The FW does the processing
Packet arrives back on the VPC Endpoint Subnet and Route Table
There it finds a 0/0 via IGW, goes out

What about those green little thingies with 2A, 2B, 2C ?

When a packet hits the FW it first goes through the Stateless Rule Groups (SUM[Stateless FW rules inside the Rule Group])
depending on the decision it is Accepted/Dropped or goes to Stateful
if going to Stateful then -> does it match SRC/DST IP/Port from TLS Inspection ?
- if yes, then do TLS man in the middle, then FWD to Stateful
- if not, FWD to Stateful
Stateful Match
- Standard Rules
- FQDN
- IDS/IPS

NAT

NAT what ?

Sadly, the AWS FW cannot do any sort of NAT.
This means you still need to add an IGW or a NAT GW in the mix, but on the positive side AWS has waived the pricing for the latter
(as long as you pay the license price for the FW).

Packet Flow Workload-Internet:

Workload sends smth toward Internet
It hits Protected Subnet RT -> finds 0/0 via VPC EP of AWS FW
AWS FW receives the traffic, takes a decision
Packet returns to VPC EP on Firewall Subnet RT -> route lookup
finds 0/0 via IGW
reaches IGW, does SNAT transparently (to 54.0.0.10), goes out

Packet Flow return traffic:

Internet destination replies
IGW receives it (dst 54.0.0.10), reverses NAT (~DNAT), now has DST = Workload = 10.0.0.10
looks up Ingress RT finds 10.0.0.0/24 via VPC EP of AWS FW
AWS FW gets the packet, allow/deny, if allow
Packet returns to VPC EP in Firewall Subnet Route Table
Finds 10.0.0.0/16 via local -> sends to Workload inside the VPC
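Both flows boil down to a longest-prefix lookup in three route tables. Here is a small simulation using the addresses from the example above; the table contents and target labels (`local`, `firewall-endpoint`, `igw`) are my reading of the flow, not an export from a real VPC, and 203.0.113.5 is just a documentation address standing in for an Internet host.

```python
import ipaddress
from typing import Dict, Optional


def lookup(route_table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the target of the most specific route containing the destination."""
    dst = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dst in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0].prefixlen)[1]


protected_subnet_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "firewall-endpoint"}
firewall_subnet_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw"}
ingress_rt = {"10.0.0.0/16": "local", "10.0.0.0/24": "firewall-endpoint"}

# Outbound: workload -> firewall -> IGW
print(lookup(protected_subnet_rt, "203.0.113.5"))
print(lookup(firewall_subnet_rt, "203.0.113.5"))
# Return: IGW -> firewall -> workload
print(lookup(ingress_rt, "10.0.0.10"))
print(lookup(firewall_subnet_rt, "10.0.0.10"))
```

The Ingress RT entry is the one people forget: without the more specific /24 via the firewall endpoint, return traffic would follow the local route and bypass the FW.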

FQDN/TLS

Here I find the AWS Website information confusing.
I wonder if it’s on purpose.

So TLS is supported….

Ingress from Internet to Workloads
E-W

NOT EGRESS like Azure FW, Aviatrix, Fortinet/Checkpoint/Palo

Weird, right ?

Then for pricing:

It does not get better… First of all support is just in Sydney and Ireland.
Then for pricing sometimes I pay an extra Advanced Inspection/GB on top and sometimes not.

Moving onward a few implementation details and caveats:

FQDN is done by Suricata after TLS decryption
FQDN seems to look at HTTP header and SNI
TLS inspection is NEW, since March 2023 -> forgive its beta character
For it to work you cannot import your CA or use an AWS stored CA
You need individual certificates !!!!
You can use max 10 !!!!
Have more than 10 services that require TLS inspection on Ingress from Internet or E-W -> BAD LUCK
You can have 20 TLS inspection configs, but then again max 10 certs (“cool”)

The setup is pretty easy.
Associate a TLS decrypt certificate with your new TLS Inspection Config:

You give it a name:

Define for which traffic to apply it (SRC/DST Address/Port)
Remember that this will match after the traffic went out from the Stateless Rule Groups (sum[rules])

Review and confirm:

Other things to keep in mind:

Fail-Close behavior (non-TLS traffic hitting TLS Config = DROP)
the technology will probably mature in the future as people start using it and AWS starts bringing it on par with other industry FWs

Pricing

I don’t normally put Pricing in the middle of a post, but I found that Advanced License / GB in some regions super confusing, so I felt the need to follow up.

Oh wait, it did not get better…

Now I have:

Regions with pricing 0 for Advanced Inspection Traffic Processing (as per the official note) -> why some so, some not ?
Regions that are not supposed to have AWS FW TLS Inspection have pricing for it -> living in the future ?
NAT GW prices are clearly waived as said before

IDS/IPS

Here you can define your own rules (good examples here Proofpoint Emerging Threat Rules)

Or import the AWS managed ones for both FQDN and IDS/IPS:

In total they are around ~ 20-22k so that still leaves enough capacity until the 30k max for your own custom rule groups (sum[rules]).

Define your own like this inside a RuleGroup of type Suricata compatible rule string:
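A minimal sketch of what such a rule string can look like, built in Python for readability. The domain, message and sid are placeholders, and `tls_sni_rule` is a hypothetical helper; the rule itself uses standard Suricata keywords (`tls.sni`, `content`, `nocase`, `endswith`, `flow`, `sid`, `rev`).

```python
def tls_sni_rule(action: str, domain: str, msg: str, sid: int) -> str:
    """Build a one-line Suricata rule that matches on the TLS SNI."""
    if action not in {"pass", "drop", "alert"}:
        raise ValueError(f"unsupported action: {action}")
    return (
        f"{action} tls $HOME_NET any -> $EXTERNAL_NET any "
        f'(tls.sni; content:"{domain}"; nocase; endswith; '
        f'msg:"{msg}"; flow:to_server,established; sid:{sid}; rev:1;)'
    )


rule = tls_sni_rule("drop", ".example.com", "Block example.com via SNI", 1000001)
print(rule)
```

Every rule needs its own unique `sid`, so it pays to generate them rather than type them by hand.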

Logging

Only for STATEFUL RULES.
Stateless can only send statistics to CloudWatch optionally = number of packets accepted/dropped/etc.

What you can log for statefully processed packets:

what matched and headers of the packets (if action is DROP/ALERT/REJECT)
flow logs

Example of an ALERT log:

Logging destinations:

S3 -> can query it with Athena or build Dashboards with Quicksight
CloudWatch
Kinesis Firehose -> can stream to 3rd parties

Not clear how to do a query in Athena ? -> ChatGPT really works well !! I tested it myself

AWS Firehose has various integrations and can then stream the data onward to your SIEM, to Splunk, to an ELK deployment.
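If you just want a quick look at the alert logs without Athena, a few lines of Python will do. The sample record below is made up, and its field names follow the alert log layout as I remember it from the AWS docs, so treat them as assumptions and compare with a real record from your own firewall.

```python
import json
from typing import Iterable, Iterator, Tuple

# Made-up sample record, one JSON object per line as in the delivered log files.
sample = json.dumps({
    "firewall_name": "example-fw",
    "availability_zone": "eu-west-1a",
    "event_timestamp": "1700000000",
    "event": {
        "event_type": "alert",
        "src_ip": "10.0.0.10",
        "src_port": 51234,
        "dest_ip": "203.0.113.5",
        "dest_port": 443,
        "proto": "TCP",
        "alert": {
            "action": "blocked",
            "signature_id": 1000001,
            "signature": "Block example.com via SNI",
            "severity": 1,
        },
    },
})


def blocked_alerts(lines: Iterable[str]) -> Iterator[Tuple[str, str, int, str]]:
    """Yield (src, dst, dst_port, signature) for every blocked alert record."""
    for line in lines:
        event = json.loads(line).get("event", {})
        alert = event.get("alert", {})
        if event.get("event_type") == "alert" and alert.get("action") == "blocked":
            yield event["src_ip"], event["dest_ip"], event["dest_port"], alert["signature"]


for hit in blocked_alerts([sample]):
    print(hit)
```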

FW Designs

Very good post from Evgeny here: AWS FW Designs

Distributed

You place an AWS FW Endpoint in each VPC that needs it.

BEWARE: Max 5 x AWS FW / Region

Centralized

This one uses a TGW.
You are routing your traffic between instances in different VPCs via TGW.
The TGW sends it out toward the Security/Inspection VPC.
AWS FW does its magic, then sends it back to TGW.
TGW returns it to the destination VPC (if a workload) or sends it to an Egress VPC + NAT GW (if Internet destined).

Limits

The ones that stick out:

Max 5 x FWs / Region (distributed scenario might not be enough)
20 x RuleGroups (Stateful|Stateless)
- each can contain multiple Rules so ~ OK
IP SET references just 5
30000 Capacity (~TCAM like)
Stateless rules have no logging, only statistics regarding accepted/dropped packets
max 20 TLS inspection configs but even so max 10 SSL certificates for your domains
maintaining routes to the FWs is a cumbersome task -> up to the user

Takeaways

Two levels of rules (stateful/stateless)
Confusing design of rules + ordering (Default/Strict)
Grouping of resources (IPSET / ResourceGroups, nice one based on VM Tags)
TCAM allocation + Quotas / RuleGroup
Manual Routing, management overhead
NO SNAT/DNAT support (extra NAT GW needed)
TLS decryption seems more of a “beta” release
No QUIC, TLS v1.3 Encrypted SNI, Encrypted Client Hello (no other vendor does it either though)
No Geo Blocking
IDS can use AWS Managed Rulesets (quite a lot of rules inside)
Flexible Logging + SIEM integrations
- just for Stateful Rules
Pricing (*no NATGW charge) but confusing information on the AWS Website

AWS Firewall: je ne sais quoi


Mihai Tanasescu