very specific: It means that http://slashdot.org will work, whereas http://www.slashdot.org will not. The BBC News site stores its images on the site http://newsimg.bbc.co.uk, which is why they do not appear.
Go back to the configuration file and edit the newssites ACL to this:
acl newssites dstdomain .bbc.co.uk .slashdot.org
Putting the period in front of the domains (and, in the BBC's case, dropping the news prefix as well) means that Squid allows any subdomain of the site to work, which is usually what you will want. If you want an even broader match, you can just specify .com to match *.com addresses.
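To make the difference concrete, here is a small sketch contrasting the two forms. The ACL names exactsite, wholesite, and dotcom are made up for illustration and are not part of the configuration built so far.

```conf
# Hypothetical ACL names, for illustration only.
# No leading period: only the exact host slashdot.org matches.
acl exactsite dstdomain slashdot.org
# Leading period: slashdot.org and any subdomain (www.slashdot.org, and so on) match.
acl wholesite dstdomain .slashdot.org
# Broadest form: any host whose name ends in .com matches.
acl dotcom dstdomain .com
```

A category as broad as dotcom is probably more useful for testing than as the basis of a real allow rule.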
Moving on, you can also use time conditions for sites. For example, if you want to allow access to the news sites in the evenings, you can set up a time category using this line:
acl freetime time MTWHFAS 18:00-23:59
This time, the category is called freetime and the condition is time, which means we need to specify what time the category should contain. The seven characters following that are the days of the week: Monday, Tuesday, Wednesday, tHursday, Friday, sAturday, and Sunday. Thursday and Saturday use capital H and A so that they do not clash with Tuesday and Sunday, which keep T and S.
With the freetime category defined, you can change the http_access line to include it, like this:
http_access allow newssites freetime
For Squid to allow access now, it must match both conditions: the request must be for either *.bbc.co.uk or *.slashdot.org, and it must arrive during the time specified. If either condition does not match, the line is not matched and Squid continues looking for other matching rules beneath it. The times you specify here are inclusive on both sides, which means users in the freetime category can surf from 18:00:00 until 23:59:59.
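As a further sketch of the same syntax, the following lines define two more time categories. The names weekend and lunch are hypothetical, and the hours are arbitrary; only the day letters and the hh:mm-hh:mm range format come from the description above.

```conf
# Hypothetical categories, for illustration only.
# S is Sunday and A is Saturday, so this covers the whole weekend.
acl weekend time SA 00:00-23:59
# Monday through Friday, over a lunch hour.
acl lunch time MTWHF 12:00-13:00
# Allow the news sites at lunch as well as in the evening.
http_access allow newssites lunch
http_access allow newssites freetime
```

Each http_access line is tried in turn, so a request for a news site is allowed if it arrives during either period.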
You can add as many rules as you like, although you should be careful to try to order them so that they make sense. Keep in mind that all conditions in a line must be matched for the line to be matched. Here is a more complex example:
> You want a category newssites that contains serious websites people need for their work.
> You want a category playsites that contains websites people do not need for their work.
> You want a category worktime that stretches from 09:00 to 18:00.
> You want a category freetime that stretches from 18:00 to 20:00, when the office closes.
> You want people to be able to access the news sites, but not the play sites, during working hours.
> You want people to be able to access both the news sites and the play sites during the free time hours.
To do that, you need the following rules:
acl newssites dstdomain .bbc.co.uk .slashdot.org
acl playsites dstdomain .tomshardware.com fedora.redhat.com
acl worktime time MTWHF 9:00-18:00
acl freetime time MTWHF 18:00-20:00
http_access allow newssites worktime
http_access allow newssites freetime
http_access allow playsites freetime
The letter D is equivalent to MTWHF in meaning 'all the days of the working week.'
Notice that there are two http_access lines for the newssites category: one for worktime and one for freetime. All the conditions must be matched for a line to be matched. The alternative would be to write this:
http_access allow newssites worktime freetime
However, if you do that and someone visits news.bbc.co.uk at 2:30 p.m. (14:30) on a Tuesday, Squid works like this:
> Is the site in the newssites category? Yes, continue.
> Is the time within the worktime category? Yes, continue.
> Is the time within the freetime category? No; do not match rule, and continue searching for rules.
Two lines therefore are needed for the newssites category: one paired with worktime and one paired with freetime.
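When the two periods sit back to back, as they do here, another option is to describe the whole span with one time category, so that a single http_access line covers the news sites. This is only a sketch: the name officeopen is hypothetical, and it assumes the newssites, playsites, and freetime categories are already defined as shown earlier.

```conf
# Hypothetical category spanning worktime and freetime together.
acl officeopen time MTWHF 9:00-20:00
# One line now covers the news sites for the whole day.
http_access allow newssites officeopen
# The play sites are still limited to the free time hours.
http_access allow playsites freetime
```

This works only because the two periods are contiguous; for periods with a gap between them, separate http_access lines remain the straightforward choice.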
One particularly powerful way to filter requests is with the url_regex ACL line. This enables you to specify a regular expression that is checked against each request: if the expression matches the request, the condition matches.
For example, if you want to stop people downloading Windows executable files, you would use this line:
acl noexes url_regex -i exe$
The dollar sign means 'end of URL,' which means it would match http://www.somesite.com/virus.exe but not http://www.executable.com/innocent.html. The -i part means 'case-insensitive,' so the rule matches .exe, .Exe, .EXE, and so on. You can use the caret sign (^) for 'start of URL.'
For example, you could stop some pornography sites by using this ACL:
acl noporn url_regex -i sex
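An ACL line on its own only defines a category; it takes effect when an http_access line refers to it. A minimal sketch of how the two categories above might be put to work follows. The deny lines and their position are assumptions for illustration, combined with the allow rules from the earlier example.

```conf
# Deny requests that match either regular-expression category defined above.
# These lines are placed before the allow lines so that they are checked first.
http_access deny noexes
http_access deny noporn
http_access allow newssites worktime
http_access allow newssites freetime
http_access allow playsites freetime
```

Because Squid acts on the first line that matches, a deny rule placed below a matching allow rule would never be reached. If exe$ proves too loose (it also matches any URL that merely ends in the letters exe), a pattern such as \.exe$ restricts the match to a literal .exe suffix.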
Do not forget to run the kill -SIGHUP command each time you make changes to Squid's configuration; otherwise, Squid does not reread your changes. You can have Squid check your configuration files for errors by running squid -k parse as root. If you see no errors, your configuration is fine.
It is critical that you run the command kill -SIGHUP and provide it the process ID of your Squid daemon each time you change the configuration; without this, Squid does not reread its configuration files.
Specifying Client IP Addresses
The configuration options so far have been basic, and there are many more you can use to enhance the proxying system you want.
After you are past deciding which rules work for you locally, it is time to spread them out to other machines. This is done by specifying IP ranges that should be allowed or disallowed access, and you enter these into Squid by using more ACL lines.
If you want to, you can specify all the IP addresses on your network, one per line. However, for networks of more than about 20 people or that use DHCP, that is more work than necessary. A better solution is to use classless interdomain routing (CIDR) notation, which enables you to specify addresses like this:
192.0.0.0/8
192.168.0.0/16
192.168.0.0/24
Each line has an IP address, followed by a slash and then a number. That last number defines the range of addresses you want covered and refers to the number of bits in an IP address. An IP address is a 32-bit number, but we are used to seeing it in dotted-quad notation: A.B.C.D. Each of those quads can be between 0 and 255