Sunday, December 13, 2009

1. What is a SAN?
A SAN, or storage area network, is a dedicated network that is separate from LANs
and WANs. It generally serves to interconnect the storage-related resources that are
connected to one or more servers. It is often characterized by its high interconnection
data rates (Gigabits/sec) between member storage peripherals and by its highly
scalable architecture. Though typically spoken of in terms of hardware, SANs very
often include specialized software for their management, monitoring and
configuration.
SANs can provide many benefits. Centralizing data storage operations and their
management is certainly one of the chief reasons that SANs are being specified and
developed today. Administering all the storage resources in high-growth and
mission-critical environments can be daunting and very expensive. SANs can
dramatically reduce the management costs and complexity of these environments
while providing significant technical advantages.
SANs can be based upon several different types of high-speed interfaces. In fact,
many SANs today use a combination of different interfaces. Currently, Fibre Channel
serves as the de facto standard being used in most SANs. Fibre Channel is an
industry-standard interconnect and high-performance serial I/O protocol that is media
independent and supports simultaneous transfer of many different protocols.
Additionally, SCSI interfaces are frequently used as sub-interfaces between internal
components of SAN members, such as between raw storage disks and a RAID
controller.
Providing large increases in storage performance, state-of-the-art reliability, and
scalability are primary SAN benefits. Storage performance of a SAN can be much
higher than traditional direct attached storage, largely because of the very high data
transfer rates of the electrical interfaces used to connect devices in a SAN (such as
Fibre Channel). Additionally, performance gains can come from opportunities
provided by a SAN’s flexible architecture, such as load balancing and LAN-free
backup. Even storage reliability can be greatly enhanced by special features made
possible within a SAN. Options like redundant I/O paths, server clustering, and run-time
data replication (local and/or remote) can ensure data and application
availability. Adding storage capacity and other storage resources can be
accomplished easily within a SAN, often without the need to shut down or even
quiesce the server(s) or their client networks. These features can quickly add up to
large cost savings, fewer network outages, painless storage expansion, and reduced
network loading.
By providing these dedicated and “very high speed” networks for storage and backup
operations, SANs can quickly justify their implementation. Offloading tasks, such as
backup, from LANs and WANs is vital in today’s IT environments where network
loads and bandwidth availability are critical metrics by which organizations measure
their own performance and even profits. Backup windows have shrunk dramatically,
and some environments have no backup windows at all since entire data networks
and applications often require 24x365 availability.
As with many IT technologies, SANs depend on new and developing standards to
ensure seamless interoperability between their member components. SAN hardware
components such as Fibre Channel hubs, switches, host bus adapters, bridges and
RAID storage systems rely on many adopted standards for their connectivity. SAN
software, every bit as important as its hardware, often provides many of the features and
benefits that SANs have come to be known for. SAN software can provide or enable
foundation features and capabilities, including:
· SAN Management
· SAN Monitoring (including “phone home” notification features)
· SAN Configuration
· Redundant I/O Path Management
· LUN Masking and Assignment
· Serverless Backup
· Data Replication (both local and remote)
· Shared Storage (including support for heterogeneous platform environments)

to be continued...

Wednesday, July 15, 2009

Apache Performance Tuning


General

RAM

The single biggest issue affecting webserver performance is RAM. Have as much RAM as your hardware, OS, and funds allow [within reason].

The more RAM your system has, the more processes [and threads] Apache can allocate and use, which directly translates into the number of concurrent requests/clients Apache can serve.

Generally speaking, disk I/O is usually a close 2nd, followed by CPU speed and network link. Note that a single PII 400 MHz with 128-256 MB of RAM can saturate a T3 (45 Mbps) line.

Select MPM

Choose the right MPM for the right job:

prefork [default MPM for Apache 2.0 and 1.3]:
  • Apache 1.3-based.
  • Multiple processes, 1 thread per process, processes handle requests.
  • Used for security and stability.
  • Has higher memory consumption and lower performance than the newer Apache 2.0-based threaded MPMs.
worker:
  • Apache 2.0-based.
  • Multiple processes, many threads per process, threads handle requests.
  • Used for lower memory consumption and higher performance.
  • Does not provide the same level of request-to-request isolation as a process-based MPM does.
winnt:
  • The only MPM choice under Windows.
  • 1 parent process, exactly 1 child process with many threads, threads handle requests.
  • Best solution under Windows, as on this platform, threads are always "cheaper" to use over processes.

Configure MPM

Core Features and Multi-Processing Modules
Default Configurations

prefork MPM:

StartServers 8
MinSpareServers 5
MaxSpareServers 20
MaxClients 150
MaxRequestsPerChild 1000

worker MPM:

StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0

winnt MPM:

ThreadsPerChild 250
MaxRequestsPerChild 0

Directives
MaxClients, for prefork MPM

MaxClients sets a limit on the number of simultaneous connections/requests that will be served.

I consider this directive to be the critical factor to a well-functioning server. Set this number too low and resources will go to waste. Set this number too high and an influx of connections will bring the server to a standstill. Set this number just right and your server will fully utilize the available resources.

An approximation of this number should be derived by dividing the amount of system memory (physical RAM) available by the maximum size of an apache/httpd process; with a generous amount spared for all other processes.

MaxClients ≈ (RAM - size_all_other_processes)/(size_apache_process)

Use 'ps -ylC httpd --sort=rss' to find the process size. Divide the number by 1024 to get megabytes. Also try 'top'.

Use 'free -m' for a general overview. The key figure to look at is the buffers/cache used value.

Use 'vmstat 2 5' to display the number of runnable, blocked, and waiting processes; and swap in and swap out.
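As a quick sanity check, here is a rough sketch of that calculation in shell. The 10MB httpd size and the 300MB reserved for other processes are illustrative assumptions, not measured values; get your own numbers from ps/top/free first:

APACHE_MB=10      # measured httpd process size, in MB (assumption)
OTHER_MB=300      # RAM reserved for MySQL and everything else (assumption)
TOTAL_MB=$(free -m | awk '/^Mem:/{print $2}')
echo "MaxClients ~= $(( (TOTAL_MB - OTHER_MB) / APACHE_MB ))"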

Example:

  • System: VPS (Virtual Private Server), CentOS 4.4, with 128MB RAM
  • Apache: v2.0, mpm_prefork, mod_php, mod_rewrite, mod_ssl, and other modules
  • Other Services: MySQL, Bind, SendMail
  • Reported System Memory: 120MB
  • Reported httpd process size: 7-13MB
  • Assumed memory available to Apache: 90MB

Optimal settings:

  • StartServers 5
  • MinSpareServers 5
  • MaxSpareServers 10
  • ServerLimit 15
  • MaxClients 15
  • MaxRequestsPerChild 2000

With the above configuration, we start with 5-10 processes and set a top limit of 15. Anything above this number will cause serious swapping and thrashing under load, due to the low amount of RAM available to the [virtual] Server. With a dedicated Server, the default values [ServerLimit 256] will work with 1-2GB of RAM.

When calculating MaxClients, take into consideration that the reported size of a process and the effective size are two different values. In this setup, it might be safe to use 20 or more workers... Play with different values and check your system stats.

Note that when more connections are attempted than there are workers, the connections are placed into a queue. The default queue size value is 511 and can be adjusted with the ListenBacklog directive.

ThreadsPerChild, for winnt MPM

On the Windows side, the only useful directive is ThreadsPerChild, which is usually set to a value of 250 [defaults to 64 without a value]. If you expect more, or less, concurrent connections/requests, set this directive appropriately. Check process size with Task Manager, under different values and server load.

MaxRequestsPerChild

The MaxRequestsPerChild directive is used to recycle processes. When this directive is set to 0, an unlimited number of requests are allowed per process.

While some might argue that this increases server performance by not burdening Apache with having to destroy and create new processes, there is the other side to the argument...

Setting this value to the amount of requests that a website generates per day, divided by the number of processes, will have the benefit of keeping memory leaks and process bloat to a minimum [both of which are a common problem]. The goal here is to recycle each process once per day, as apache threads gradually increase their memory allocation as they run.
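As a worked example with illustrative numbers: a site serving 200,000 requests per day from roughly 10 processes would set this to about 200,000 / 10 = 20,000, so each process gets recycled about once per day:

MaxRequestsPerChild 20000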

Note that under the winnt MPM model, recycling the only request serving process that Apache contains, can present a problem for some sites with constant and heavy traffic.

Requests vs. Client Connections

On any given connection, to load a page, a client may request many URLs: page, site css files, javascript files, image files, etc.

Multiple requests from one client in rapid succession can have the same effect on a Server as "concurrent" connections [threaded MPMs and directive KeepAlive taken into consideration]. If a particular website requires 10 requests per page, 10 concurrent clients will require MPM settings that are geared more towards 20-70 clients. This issue manifests itself most under a process-based MPM [prefork].

Separate Static and Dynamic Content

Use separate servers for static and dynamic content. Apache processes serving dynamic content will carry overhead and swell to the size of the content being served, never decreasing in size. Each process will incur the size of any loaded PHP or Perl libraries. A 6MB-30MB process size [or 10% of the server's memory] is not unusual, and becomes a waste of resources for serving static content.

For a more efficient use of system memory, either use mod_proxy to pass specific requests onto another Apache Server, or use a lightweight server to handle static requests:

  • lighttpd [has experimental win32 builds]
  • tux [patched into RedHat, runs inside the Linux kernel and is at the top of the charts in performance]

The Server handling the static content goes up front.

Note that configuration settings will be quite different between a dynamic content Server and a static content Server.

mod_deflate

Reduce bandwidth by 75% and improve response time by using mod_deflate.

LoadModule deflate_module modules/mod_deflate.so

AddOutputFilterByType DEFLATE text/html text/plain text/css text/xml application/x-javascript
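The stock mod_deflate recipe from the Apache documentation also works around old browsers with broken gzip support; adding these lines is optional but common:

BrowserMatch ^Mozilla/4 gzip-only-text/html
BrowserMatch ^Mozilla/4\.0[678] no-gzip
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html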

Loaded Modules

Reduce memory footprint by loading only the required modules.

Some also advise statically compiling in the needed modules rather than building DSOs (Dynamic Shared Objects). Very bad advice. You will need to manually rebuild Apache every time a new version or security advisory for a module is put out, creating more work, more build-related headaches, and more downtime.

mod_expires

Include mod_expires for the ability to set expiration dates for specific content; it works with the cache controls sent by the user's browser/proxy (such as the 'If-Modified-Since' header). This will save bandwidth and drastically speed up your site for [repeat] visitors.

Note that this can also be implemented with mod_headers.
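A minimal mod_expires setup might look like the following; the lifetimes are illustrative and should be tuned to how often your content actually changes:

ExpiresActive On
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType text/css "access plus 1 week"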

KeepAlive

Enable HTTP persistent connections to improve latency times and reduce server load significantly [25% of original load is not uncommon].

prefork MPM:

KeepAlive On
KeepAliveTimeout 2
MaxKeepAliveRequests 80

worker and winnt MPMs:

KeepAlive On
KeepAliveTimeout 15
MaxKeepAliveRequests 80

With the prefork MPM, it is often recommended to set 'KeepAlive' to 'Off'; otherwise, a client will tie up an entire process for that span of time. In my experience, though, it is more useful to simply set the 'KeepAliveTimeout' value to something very low [2 seconds seems to be the ideal value]. This is not a problem with the worker MPM [thread-based], or under Windows [which only has the thread-based winnt MPM].

With the worker and winnt MPMs, the default 15-second timeout is set up to keep the connection open for the next page request, to better handle a client going from link to link. Check your logs to see how long a client remains on each page before moving on to another link, and set the value appropriately [do not set it higher than 60 seconds].

SymLinks

Make sure 'Options +FollowSymLinks -SymLinksIfOwnerMatch' is set for all directories. Otherwise, Apache will issue an extra system call per filename component to verify that the filename is NOT a symlink, and more system calls to match an owner.


<Directory />
    Options FollowSymLinks
</Directory>

AllowOverride

Set a default 'AllowOverride None' for your filesystem. Otherwise, for a given URL to path translation, Apache will attempt to detect an .htaccess file under every directory level of the given path.


<Directory />
    AllowOverride None
</Directory>

ExtendedStatus

If mod_status is included, make sure that directive 'ExtendedStatus' is set to 'Off'. Otherwise, Apache will issue several extra time-related system calls on every request made.

ExtendedStatus Off

Timeout

Lower the amount of time the server will wait before failing a request.

Timeout 45

Other/Specific

Cache all PHP pages, using Squid, and/or a PHP Accelerator and Encoder application, such as APC. Also take a look at mod_cache under Apache 2.2.

Convert/pre-render all PHP pages that do not change request-to-request, to static HTML pages. Use 'wget' or 'HTTrack' to crawl your site and perform this task automatically.
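For example, a crawl of your own site with wget might look like this (URL illustrative):

wget --mirror --convert-links http://www.example.com/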

Pre-compress content and pre-generate headers for static pages; send-as-is using mod_asis. Can use 'wget' or 'HTTrack' for this task. Make sure to set zlib Compression Level to a high value (6-9). This will take a considerable amount of load off the server.

Use output buffering under PHP to generate output and serve requests without pauses.

Avoid content negotiation for faster response times.

Make sure log files are being rotated. Apache will not handle large (2GB+) files very well.

Gain a significant performance improvement by using an SSL session cache.

Outsource your images to Amazon's Simple Storage Service (S3).

Measuring Web Server Performance

Load Testing

  • ab - the Apache HTTP server benchmarking tool
  • httperf
  • The Grinder, a Java Load Testing Framework

Benchmarks

I have searched extensively for Apache, lighttpd, tux, and other webserver benchmarks. Sadly, just about every single benchmark I could locate appeared to have been performed completely without thought, or with great bias.

Do not trust any posted benchmarks, especially ones done with the 'ab' tool.

The only way to get a valid report is to perform the benchmark yourself.

For valid results, be sure to test both on a system with limited resources and on one with maximum resources. Most importantly, configure each httpd server application for the specific situation.
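For a quick smoke test of your own configuration, the ab tool shipped with Apache can be invoked like this (URL and load figures are illustrative; keep its limitations, noted above, in mind):

ab -n 1000 -c 10 http://localhost/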

Saturday, June 27, 2009

DAG repo to yum

You can make yum more robust by adding more repositories like DAG, UPDATE and RPMforge. To add extra repositories to yum, do the following.

cd /etc/yum.repos.d
vi dag.repo    # then add the following lines to that file

[dag]
name=Dag RPM Repository for Red Hat Enterprise Linux
baseurl=http://apt.sw.be/redhat/el$releasever/en/$basearch/dag
gpgcheck=1

Since gpgcheck is enabled, also import the repository's GPG key. This is a shell command, not a line in the .repo file:

rpm --import http://dag.wieers.com/rpm/packages/RPM-GPG-KEY.dag.txt


After saving the file and importing the key, run the following command

yum check-update


Yum will now have these additional repositories available.

Thursday, June 18, 2009

MySQL Tweak[core level]

These are my.cnf values from a dual Xeon with 2 GB of RAM. This is a shared hosting machine that runs MySQL and web, so not all memory is allocated to MySQL.
------------------------------------------------
/etc/my.cnf

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
skip-locking
skip-innodb
query_cache_limit=1M
query_cache_size=32M
query_cache_type=1
max_connections=900
interactive_timeout=100
wait_timeout=100
connect_timeout=10
thread_cache_size=128
#key_buffer=16M
key_buffer=200M
join_buffer=1M
max_allowed_packet=16M
table_cache=1536
sort_buffer_size=1M
read_buffer_size=1M
read_rnd_buffer_size=1M
max_connect_errors=10
# Try number of CPU's*2 for thread_concurrency
thread_concurrency=4
myisam_sort_buffer_size=64M
#log-bin
server-id=1

Query caching was added as of MySQL version 4; the following three directives will greatly enhance MySQL server performance.

query_cache_limit=1M
query_cache_size=32M
query_cache_type=1

Query caching is a server-wide variable, so set these generously. I have found the above levels are generally best if your server has at least 512 MB of RAM. If you run a server just for DBs with a lot of RAM, you can raise these quite a bit, like a 2M limit and a 64M+ cache size.

The key buffer is a variable that is shared amongst all MySQL clients on the server. A large setting is recommended, and is particularly helpful with tables that have unique keys. (Most do.)

key_buffer=150M

The next set of buffers are at a per-client level. It is important to play around with these and get them just right for your machine. With the settings below, every active MySQL client will have close to 3 MB in buffers, so 100 clients = almost 300 MB. Giving too much to these buffers will be worse than giving too little. Nothing kills a server quite like memory swapping does.

sort_buffer_size=1M
read_buffer_size=1M
read_rnd_buffer_size=768K

The following directive should be set to 2X the number of processors in your machine for best performance.

thread_concurrency=2

Here are a few example configurations for servers running MySQL and web, for common memory sizes. These are not perfect, but they are good starting points.

Server with 512MB RAM:

thread_cache_size=50
key_buffer=40M
table_cache=384
sort_buffer_size=768K
read_buffer_size=512K
read_rnd_buffer_size=512K
thread_concurrency=2

For servers with 1 GB ram:

thread_cache_size=80
key_buffer=150M
table_cache=512
sort_buffer_size=1M
read_buffer_size=1M
read_rnd_buffer_size=768K
thread_concurrency=2

########################################################

For optimizing mysql, first we need to know the values of mysql variables and status.
The following are some commands used for this purpose:
# mysqladmin processlist extended-status

or

mysql> show status;
mysql> show variables;

To get a more specific answer, the commands can be narrowed down a little, as follows:

mysql> show status like '%Open%_tables';

mysql> show variables like 'table_cache';

1. The most important variables in MySQL are table_cache and key_buffer_size

a) Run the above two commands and check Open_tables and Opened_tables.
If Opened_tables is big, then your table_cache variable is probably too small.

In that case, increase the table_cache variable: open /etc/my.cnf and change/add table_cache=newvalue.

b) Run the following commands to check key_buffer_size, key_read_requests and key_reads

mysql> show variables like '%key_buffer_size%';
mysql> show status like '%key_read%';

If key_reads / key_read_requests is < 0.01, key_buffer_size is enough. Otherwise key_buffer_size should be increased.
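As an illustrative example: if SHOW STATUS reports Key_reads = 3000 and Key_read_requests = 1000000, the ratio is 3000 / 1000000 = 0.003, which is well below 0.01, so key_buffer_size is adequate.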

Also run the following command to check key_write_requests and key_writes

mysql> show status like '%key_write%';

If key_writes / key_write_requests is not less than 1 (near 0.5 seems to be fine), increase key_buffer_size.

Check the total size of all .MYI files. If it is larger than key_buffer_size, change key_buffer_size to the total size of the .MYI files.
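A quick way to total the .MYI files from a shell (the path assumes the default datadir from the my.cnf above):

du -ch /var/lib/mysql/*/*.MYI | tail -1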

2. wait_timeout, max_connections, thread_cache

If you want to allow more connections, reduce wait_timeout to 15 seconds and increase max_connections as needed.

Check the number of idle connections. If it is too high, reduce the wait_timeout and use the thread cache.

thread_cache_size determines how many threads we should keep in a cache for reuse. When a client disconnects, the client's threads are put in the cache if there aren't more than thread_cache_size threads there already. All new threads are first taken from the cache, and only when the cache is empty is a new thread created. This variable can be increased to improve performance if you have a lot of new connections. (Normally this doesn't give a notable performance improvement if you have a good thread implementation.) By examining the difference between Connections and Threads_created you can see how efficient the current thread cache is for you.

If Threads_created is big, you may want to increase the thread_cache_size variable. The cache miss rate can be calculated as Threads_created/Connections.

The default thread_cache_size may be 0; if so, increase it to 8.

You may also try this formula: table_cache = Opened_tables / Max_used_connections
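To pull the relevant counters for these calculations (output will vary per server):

mysql> show status like 'Threads_created';
mysql> show status like 'Connections';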

Monday, June 15, 2009

Hub, Switches, and Routers


Hub
A common connection point for devices in a network. Hubs are commonly used to connect segments of a LAN. A hub contains multiple ports. When a packet arrives at one port, it is copied to the other ports so that all segments of the LAN can see all packets.

Switch
In networks, a device that filters and forwards packets between LAN segments. Switches operate at the data link layer (layer 2) and sometimes the network layer (layer 3) of the OSI Reference Model and therefore support any packet protocol. LANs that use switches to join segments are called switched LANs or, in the case of Ethernet networks, switched Ethernet LANs.

Router
A device that forwards data packets along networks. A router is connected to at least two networks, commonly two LANs or WANs or a LAN and its ISP's network. Routers are located at gateways, the places where two or more networks connect. Routers use headers and forwarding tables to determine the best path for forwarding the packets, and they use protocols such as ICMP to communicate with each other and configure the best route between any two hosts.



The Differences Between These Devices on the Network
Today most routers have become something of a Swiss Army knife, combining the features and functionality of a router and switch/hub into a single unit. So conversations regarding these devices can be a bit misleading — especially to someone new to computer networking.

The functions of a router, hub and a switch are all quite different from one another, even if at times they are all integrated into a single device. Let's start with the hub and the switch since these two devices have similar roles on the network. Each serves as a central connection for all of your network equipment and handles a data type known as frames. Frames carry your data. When a frame is received, it is amplified and then transmitted on to the port of the destination PC. The big difference between these two devices is in the method in which frames are being delivered.

In a hub, a frame is passed along or "broadcast" to every one of its ports. It doesn't matter that the frame is only destined for one port. The hub has no way of distinguishing which port a frame should be sent to. Passing it along to every port ensures that it will reach its intended destination. This places a lot of traffic on the network and can lead to poor network response times.

Additionally, a 10/100Mbps hub must share its bandwidth with each and every one of its ports. So when only one PC is broadcasting, it will have access to the maximum available bandwidth. If, however, multiple PCs are broadcasting, then that bandwidth will need to be divided among all of those systems, which will degrade performance.

A switch, however, keeps a record of the MAC addresses of all the devices connected to it. With this information, a switch can identify which system is sitting on which port. So when a frame is received, it knows exactly which port to send it to, without significantly increasing network response times. And, unlike a hub, a 10/100Mbps switch will allocate a full 10/100Mbps to each of its ports. So regardless of the number of PCs transmitting, users will always have access to the maximum amount of bandwidth. It's for these reasons that a switch is considered to be a much better choice than a hub.

Routers are completely different devices. Where a hub or switch is concerned with transmitting frames, a router's job, as its name implies, is to route packets to other networks until that packet ultimately reaches its destination. One of the key features of a packet is that it not only contains data, but the destination address of where it's going.

A router is typically connected to at least two networks, commonly two Local Area Networks (LANs) or Wide Area Networks (WANs) or a LAN and its ISP's network; for example, your PC or workgroup and EarthLink. Routers are located at gateways, the places where two or more networks connect. Using headers and forwarding tables, routers determine the best path for forwarding the packets. Routers use protocols such as ICMP to communicate with each other and configure the best route between any two hosts.

Today, a wide variety of services are integrated into most broadband routers. A router will typically include a 4- to 8-port Ethernet switch (or hub) and a Network Address Translator (NAT). In addition, they usually include a Dynamic Host Configuration Protocol (DHCP) server, Domain Name Service (DNS) proxy server and a hardware firewall to protect the LAN from malicious intrusion from the Internet.

All routers have a WAN Port that connects to a DSL or cable modem for broadband Internet service and the integrated switch allows users to easily create a LAN. This allows all the PCs on the LAN to have access to the Internet and Windows file and printer sharing services.

Some routers have a single WAN port and a single LAN port and are designed to connect an existing LAN hub or switch to a WAN. Ethernet switches and hubs can be connected to a router with multiple PC ports to expand a LAN. Depending on the capabilities (kinds of available ports) of the router and the switches or hubs, the connection between the router and switches/hubs may require either straight-thru or crossover (null-modem) cables. Some routers even have USB ports, and more commonly, wireless access points built into them.

Some of the more high-end or business class routers will also incorporate a serial port that can be connected to an external dial-up modem, which is useful as a backup in the event that the primary broadband connection goes down, as well as a built in LAN printer server and printer port.

Besides the inherent protection features provided by the NAT, many routers will also have a built-in, configurable, hardware-based firewall. Firewall capabilities can range from the very basic to the quite sophisticated. Among the capabilities found on leading routers are those that permit configuring TCP/UDP ports for games, chat services, and the like, on the LAN behind the firewall.

So, in short, a hub glues together an Ethernet network segment, a switch can connect multiple Ethernet segments more efficiently and a router can do those functions plus route TCP/IP packets between multiple LANs and/or WANs; and much more of course.

DNS (Domain Name Service)


Domain Name Service
Host Names

Domain Name Service (DNS) is the service used to convert human readable names of hosts to IP addresses. Host names are not case sensitive and can contain alphabetic or numeric letters or the hyphen. Avoid the underscore. A fully qualified domain name (FQDN) consists of the host name plus domain name as in the following example:

computername.domain.com

The part of the system sending the queries is called the resolver and is the client side of the configuration. The nameserver answers the queries. Read RFCs 1034 and 1035; these contain the bulk of the DNS information and are updated by RFCs 1535-1537. Naming is covered in RFC 1591. The main function of DNS is the mapping of human readable names to IP addresses.

Three main components of DNS

1. resolver
2. name server
3. database of resource records(RRs)

Domain Name System

The Domain Name System (DNS) is basically a large database which resides on various computers, containing the names and IP addresses of various hosts on the internet and various domains. The Domain Name System is used to provide information to the Domain Name Service to use when queries are made. The service is the act of querying the database, and the system is the data structure and data itself.

The Domain Name System is similar to a file system in Unix or DOS, starting with a root. Branches attach to the root to create a huge set of paths. Each branch in the DNS is called a label. Each label can be 63 characters long, but most are less. Each text word between the dots can be 63 characters in length, with the total domain name (all the labels) limited to 255 bytes in overall length.

The domain name system database is divided into sections called zones. The name servers in their respective zones are responsible for answering queries for their zones. A zone is a subtree of DNS and is administered separately. There are multiple name servers for a zone. There is usually one primary nameserver and one or more secondary name servers. A name server may be authoritative for more than one zone.

DNS names are assigned through the Internet Registries by the Internet Assigned Numbers Authority (IANA). The domain name is a name assigned to an internet domain. For example, mycollege.edu represents the domain name of an educational institution. The names microsoft.com and 3Com.com represent the domain names at those commercial companies. Naming hosts within the domain is up to the individuals administering their domain.

Access to the Domain Name database is through a resolver, which may be a program or part of an operating system that resides on users' workstations. In Unix the resolver is accessed by using the library functions "gethostbyname" and "gethostbyaddr". The resolver will send requests to the name servers to return information requested by the user. The requesting computer tries to connect to the name server using its IP address rather than the name.
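From a shell, the resolver can be exercised directly with the host utility; the name and address below are illustrative:

host www.example.com      # forward lookup: name to IP address
host 192.0.2.10           # reverse lookup: IP address to name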

Structure and message format

The drawing below shows a partial DNS hierarchy. At the top is what is called the root, and it is the start of all other branches in the DNS tree. It is designated with a period. Each branch moves down from level to level. When referring to DNS addresses, they are referred to from the bottom up with the root designator (period) at the far right. Example: "myhost.mycompany.com.".

[Figure: Partial DNS Hierarchy]

DNS is hierarchical in structure. A domain is a subtree of the domain name space. From the root, the assigned top-level domains in the U.S. are:

* GOV - Government body
* EDU - Educational body
* INT - International organization
* NET - Networks
* COM - Commercial entity
* MIL - U.S. Military
* ORG - Any other organization not previously listed

Outside this list are top level domains for various countries.

Each node on the domain name system is separated by a ".". Example: "mymachine.mycompany.com.". Note that any name ending in a "." is an absolute domain name since it goes back to root.
DNS Message format:

Bits Name Description
0-15 Identification Used to match responses to requests. Set by client and returned by server.
16-31 Flags Tells if query or response, type of query, if authoritative answer, if truncated, if recursion desired, and if recursion is available.
32-47 Number of questions
48-63 Number of answer RRs
64-79 Number of authority RRs
80-95 Number of additional RRs
96-?? Questions - variable lengths There can be variable numbers of questions sent.
??-?? Answers - variable lengths Answers are variable numbers of resource records.
??-?? Authority - variable lengths
??-?? Additional Information - variable lengths

Question format includes query name, query type and query class. The query name is the name being looked up. The query class is normally 1 for internet address. The query types are listed in the table below. They include NS, CNAME, A, etc.

The answers, authority and additional information are in resource record (RR) format which contains the following.

1. Domain name
2. Type - One of the RR codes listed below.
3. Class - Normally indicates internet data which is a 1.
4. Time to live field - The number of seconds the RR is saved by the client.
5. Resource data length specifies the amount of data. The data is dependent on its type such as CNAME, A, NS or others as shown in the table below. If the type is "A" the data is a 4 byte IP address.

The table below shows resource record types:

Type RR value Description
A 1 Host's IP address
NS 2 Host's or domain's name server(s)
CNAME 5 Host's canonical name, host identified by an alias domain name
PTR 12 Host's domain name, host identified by its IP address
HINFO 13 Host information
MX 15 Host's or domain's mail exchanger
AXFR 252 Request for zone transfer
ANY 255 Request for all records
Usage and file formats

If a domain name is not found when a query is made, the server may search for the name elsewhere and return the information to the requesting workstation, or return the address of a name server that the workstation can query to get more information. There are special servers on the Internet that provide guidance to all name servers. These are known as root name servers. They do not contain all information about every host on the Internet, but they do provide direction as to where domains are located (the IP address of the name server for the uppermost domain a server is requesting). The root name server is the starting point to find any domain on the Internet.
Name Server Types

There are three types of name servers:

1. The primary master builds its database from files that were preconfigured on its hosts, called zone or database files. The name server reads these files and builds a database for the zone it is authoritative for.
2. Secondary masters can provide information to resolvers just like the primary masters, but they get their information from the primary. Any updates to the database are provided by the primary.
3. Caching name server - It gets all its answers to queries from other name servers and saves (caches) the answers. It is a non-authoritative server.

The caching-only name server generates no zone transfer traffic. A DNS server that can communicate outside of the private network to resolve a DNS name query is referred to as a forwarder.
DNS Query Types

There are two main types of queries issued, plus reverse lookups:

1. Recursive queries received by a server force that server to find the information requested or post a message back to the querier that the information cannot be found.
2. Iterative queries allow the server to search for the information and pass back the best information it knows about. This is the type that is used between servers. Clients use recursive queries.
3. Reverse queries - The client provides the IP address and asks for the name. In other queries the name is provided, and the IP address is returned to the client. The reverse lookup entry for the network 192.168.100.0 is "100.168.192.in-addr.arpa".

Generally (but not always), a server-to-server query is iterative and a client-resolver-to-server query is recursive. You should also note that a server can be queried or it can be the person placing a query. Therefore, a server contains both the server and client functions. A server can transmit either type of query. If it is handed a recursive query from a remote source, it must transmit other queries to find the specified name, or send a message back to the originator of the query that the name could not be found.
DNS Transport protocol

DNS resolvers first attempt to use UDP for transport, then fall back to TCP if UDP fails (for example, when a response is truncated).
The DNS Database

A database is made up of records and the DNS is a database. Therefore, common resource record types in the DNS database are:

* A - Host's IP address. Address record allowing a computer name to be translated into an IP address. Each computer must have this record for its IP address to be located. These names are not assigned for clients that have dynamically assigned IP addresses, but are a must for locating servers with static IP addresses.
* PTR - Host’s domain name, host identified by its IP address
* CNAME - Host’s canonical name allows additional names or aliases to be used to locate a computer.
* MX - Host’s or domain’s mail exchanger.
* NS - Host’s or domain’s name server(s).
* SOA - Indicates authority for the domain
* TXT - Generic text record
* SRV - Service location record
* RP - Responsible person
* HINFO - Host information record with CPU type and operating system.

When a resolver requests information from the server, the DNS query message indicates one of the preceding types.
DNS Files

* CACHE.DNS - The DNS Cache file. This file is used to resolve internet DNS queries. On Windows systems, it is located in the WINNTROOT\system32\DNS directory and is used to configure a DNS server to use a DNS server on the internet to resolve names not in the local domain.

Example Files

Below is a partial explanation of some records in the database on a Linux based system. The reader should view this information because it explains some important DNS settings that are common to all DNS servers. An example /var/named/db.mycompany.com.hosts file is listed below.

mycompany.com. IN SOA mymachine.mycompany.com. root.mymachine.mycompany.com. (
1999112701 ; Serial number as date and two digit number YYYYMMDDXX
10800 ; Refresh in seconds 28800=8H
3600 ; Retry in seconds 7200=2H
604800 ; Expire 3600000=1 week
86400 ) ; Minimum TTL 86400=24Hours
mycompany.com. IN NS mymachine.mycompany.com.
mycompany.com. IN MX 10 mailmachine.mycompany.com.
mymachine.mycompany.com. IN A 10.1.0.100
mailmachine.mycompany.com. IN A 10.1.0.4
george.mycompany.com. IN A 10.1.3.16

A line-by-line description is as follows:

1. The entries on this line are:
1. mycompany.com. - Indicates this server is for the domain mycompany.com.
2. IN - Indicates Internet Name.
3. SOA - Indicates this server is the authority for its domain, mycompany.com.
4. mymachine.mycompany.com. - The primary nameserver for this domain.
5. root.mymachine.mycompany.com. - The person to contact for more information.
The lines in the parentheses, listed below, are for the secondary nameserver(s) which run as slave(s) to this one (since it is the master).
2. 1999112701 - Serial number - If less than master's SN, the slave will get a new copy of this file from the master.
3. 10800 - Refresh - The time in seconds between when the slave compares this file's SN with the master.
4. 3600 - Retry - The time the server should wait before asking again if the master fails to respond to a file update (SOA request).
5. 604800 - Expire - Time in seconds the slave server can respond even though it cannot get an updated zone file.
6. 86400 - TTL - The time to live (TTL) in seconds that a resolver will use data received from a nameserver before it will ask for the same data again.
7. This line is the nameserver resource record. There may be several of these if there are slave name servers.

mycompany.com. IN NS mymachine.mycompany.com.

Add any slave server entries below this like:

mycompany.com. IN NS ournamesv1.mycompany.com.
mycompany.com. IN NS ournamesv2.mycompany.com.
mycompany.com. IN NS ournamesv3.mycompany.com.

8. This line indicates the mailserver record.

mycompany.com. IN MX 10 mailmachine.mycompany.com.

There can be several mailservers. The numeric value on the line indicates the preference or precedence for the use of that mail server. A lower number indicates a higher preference. The range of values is from 0 to 65535. To enter more mailservers, enter a new line for each one similar to the nameserver entries above, but be sure to set the preferences value correctly, at different values for each mailserver.
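For example, a hypothetical backup mail server (the name backupmail is illustrative) would get a higher, less preferred value:

mycompany.com. IN MX 10 mailmachine.mycompany.com.
mycompany.com. IN MX 20 backupmail.mycompany.com.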
9. The rest of the lines are the name to IP mappings for the machines in the organization. Note that the nameserver and mailserver are listed here with IP addresses along with any other server machines required for your network.

mymachine.mycompany.com. IN A 10.1.0.100
mailmachine.mycompany.com. IN A 10.1.0.4
george.mycompany.com. IN A 10.1.3.16

Domain names written with a dot on the end are absolute names which specify a domain name exactly as it exists in the DNS hierarchy from the root. Names not ending with a dot may be a subdomain to some other domain.

Aliases are specified in lines like the following:

mymachine.mycompany.com IN CNAME nameserver.mycompany.com.
george.mycompany.com IN CNAME dataserver.mycompany.com.
Linux1.mycompany.com IN CNAME engserver.mycompany.com.
Linux2.mycompany.com IN CNAME mailserver.mycompany.com.

When a client (resolver) sends a request, if the nameserver finds a CNAME record, it replaces the requested name with the canonical name, then finds the address of that canonical name, and returns this value to the client.

A host that has more than one network card which is set to address two different subnets can have more than one address for a name.

mymachine.mycompany.com IN A 10.1.0.100
                        IN A 10.1.1.100

When a client queries the nameserver for the address of a multihomed host, the nameserver will return the address that is closest to the client address. If the client is on a different network than both the subnet addresses of the multihomed host, the server will return both addresses.

For more information on practical application of DNS, read the DNS section of the Linux User's Guide.

Sunday, June 14, 2009

ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

[root@sylesh ~]# mysql -u root
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

>> Disabling password authentication
service mysqld stop

wait until MySQL shuts down. Then run

mysqld_safe --skip-grant-tables &

then you will be able to login as root with no password.

mysql -uroot mysql

At the MySQL command-line prompt, issue the following commands:
use mysql;

UPDATE user SET password=PASSWORD("abcd") WHERE user="root";
FLUSH PRIVILEGES;
EXIT

/etc/init.d/mysqld restart

At this point your root password is reset to "abcd", MySQL will again enforce privileges, and you'll be able to log in with your new password:

mysql -uroot -p mysql

Friday, June 12, 2009

How to enable SSI On Your Server with .htaccess and XBitHack apache directive


The notes below demonstrate how to enable SSI on your server using .htaccess.

If you are paying for hosting services you may need to get permission from your host to make sure you are not violating their Terms of Service which could result in you getting the boot! Every decent host supports SSI but double-check to make sure.


To enable SSI, either create a file simply called .htaccess or edit your existing .htaccess file and place the following code in it:

AddType text/html .shtml
AddHandler server-parsed .shtml
Options Indexes FollowSymLinks Includes

Note: to enable SSI for your full web site, place the .htaccess file in the root directory of your site; to enable it for just a certain directory, place the .htaccess file only in that particular directory.

The first line of the code above tells the server that .shtml is a valid extension. The second line adds a handler to all pages with the .shtml extension which tells the server to parse (process) the document for server side includes.
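For reference, an SSI directive inside a .shtml page looks like the following; the include path is illustrative:

<!--#include virtual="/includes/footer.html" -->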

If you prefer you can use a different file extension for your files which you want parsed for server side includes. Simply change the .shtml to .shtm etc. If you also want your .htm documents parsed by the server (so you don't need to rename all your files) simply add the following after the first line of the code above:

AddHandler server-parsed .htm


If you want to use SSI in your default directory page, such as index.shtml you may (but normally won't) need to add the following to the .htaccess file:

DirectoryIndex index.shtml index.htm

This means that index.shtml can be your default page. If this page is not found the server will look for index.htm etc. More on this in the .htaccess guides section.

>>SSI Without .shtml

In order to understand what this use of htaccess can do for you, you have to understand what SSI directives are. (SSI directives are covered in the How To Use Your CGI-BIN page.) You can put an SSI directive tag in your Web page, but that doesn't mean the server will look for it. Looking through an html file for SSI directives is called "parsing", and by default a server doesn't parse every html file. It only parses pages that have a .shtml extension.

Dilemma:

You want to start using SSI directives in your Web pages to call a script or display certain things on the pages. Your host requires that pages with SSI directives have a .shtml extension. However, over time all of your pages have been linked to and indexed by search engines using their current .html extensions. If you change the extensions to comply with your host, a lot of people will start getting 404 errors.

htaccess to the rescue! Certain htaccess statements allow you to tell the server to parse certain pages that don't have a .shtml extension.

If you created the htaccess.txt file above, simply add the statements given below to it and re-ftp/rename it. If you didn't, here are the steps:

1. Use a text editor to create an htaccess.txt file and enter the following statements into it:


AddType text/html .html
AddHandler server-parsed .html


replacing .html with .htm if that's what you are using for your pages.

2. Save the file and ftp it (using ASCII mode) to your Web root directory (or whatever directory your index.html file is in).

3. Rename the htaccess.txt file on the server to .htaccess

4. Try it out by entering a URL for one of the pages that contains an SSI directive and see if it's working.

The above can be thought of as the "directory method" for enabling SSI parsing, because all files in the directory with the specified extension will be parsed, including files in any sub-directories. SSI parsing does have a small performance price due to all this parsing. If your site has a lot of traffic and a lot of pages, that performance price could add up. What if you have a lot of traffic and a lot of pages but you only have a few files that you want parsed? Then you'd want to use XBitHack, which is covered in the next section.

Not all hosts allow you to use a .htaccess file. They have to use an AllowOverride statement in one of the global configuration files. Ask your host, or a potential host, if they allow the use of .htaccess files. If so, also ask if they allow the use of XBitHack. If they say 'No' to the question of htaccess, pleading with them to enable it on your server may work, especially if you sound like you know what you're talking about (which this page will help you to do).

A .htaccess file is a very powerful tool. You can use it to set up password-protected directories, change the way Apache responds to certain events, etc. The flip side of that is that you can really hose things up or give unintended access to visitors if you're not careful. You may want to try out your attempts with .htaccess during low-traffic times on your Website so that any problems can be corrected without affecting too many visitors.

Note also that the very fact that this is a very powerful tool may be reason enough for some hosting services not to allow you to use it. A hosting service sets up multiple "virtual" Web servers so multiple domains can be hosted on a single system (each domain having its own virtual Web server). They do this by adding statements (aka directives) to the main Apache configuration file (named httpd.conf). When they add these virtual server directives they must include the directive to enable htaccess functionality. If you try the above and it doesn't work, chances are good your host doesn't have the htaccess function enabled.


What is XBitHack
----------------

XBitHack (pronounced "X bit hack") is simply one of those htaccess configuration statements mentioned above. If you're not willing to put up with the performance costs of the "directory method" for enabling parsing of non-.shtml pages covered above, think of XBitHack as a "file method". This is because you can specify on a file-by-file basis which non-.shtml files get parsed.

Using XBitHack for this "file method" has two steps:

* turn on XBitHack by adding the statement to your .htaccess file
* "flag" the html pages you want parsed by changing their permissions to something a little out of the ordinary



If you created the htaccess.txt file above, simply add the statement given below to it and re-ftp/rename it to enable XBitHack. If your .htaccess file contains the AddType and AddHandler statements from above, REMOVE THEM. If you didn't create the file earlier, here are the steps to enabling XBitHack:

1. Use a text editor to create an htaccess.txt file and enter the following statement into it:


XBitHack on


2. Save the file and ftp it (using ASCII mode) to your Web root directory (or whatever directory your index.html file is in).

3. Rename the htaccess.txt file to .htaccess

4. CHMOD the page files, and only the page files, that you want parsed (i.e. that will contain SSI directives) to 744 (instead of 644); see the example command after these steps. This is what tells the server to parse the page.

5. Try it out by entering a URL for one of the pages that contains an SSI directive and see if it's working.
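The command for step 4, run from a shell, would be (filename illustrative):

chmod 744 mypage.html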

If it doesn't work, check your error log for a message like

XBitHack not allowed here



It is possible that your host allows htaccess but not XBitHack. If you don't find the above error, you'll have to contact your host's technical support operation. However, by knowing what htaccess and XBitHack are, you can ask them intelligent questions regarding your problem. When they realize you know what you are talking about, they will be less likely to feed you a line of BS. Also, don't be surprised if the support person you speak to doesn't know what you are talking about. First-line technical support and sales people are usually entry-level jobs in an organization. If you get the sense they don't know what you are talking about, ask to speak to a more senior support person who does.




Wednesday, June 10, 2009

dbmmanage - Manage user authentication files in DBM format(apache binary)

DBM User Authentication

This week, we explain how to store user authentication information in DBM files for faster access when you have thousands of users.

The feature on User Authentication shows how to restrict pages to selected people. We showed how to use the htpasswd program to create the necessary .htpasswd files, and how to create group files to provide more control over the users. We also said that .htpasswd files and group files like this are not very efficient when a large number of users are involved. This is because these are plain text files and for every request in the authenticated area Apache has to read through the file looking for the user. A much faster way to store the user information is to use files in DBM format. This article explains how to create and manage DBM format user authentication files.

What is DBM?

DBM files are a simple and relatively standard method of storing information for quick retrieval. Each item of information stored in a DBM file consists of two parts: a key and a value. If you know the key you can access the value very quickly. The DBM file maintains an 'index' of the keys, each of which points to where the value is stored within the file, and the index is usually arranged such that values can be accessed with the minimum number of file system accesses even for very large numbers of keys.

In practice, on many systems a DBM 'file' is actually stored in two files on the disk. If, for example, a DBM file called 'users' is created, it will actually be stored in files called users.pag and users.dir. If you ever need to rename or delete a DBM from the command line, remember to change both the files, keeping the extensions (.pag and .dir) the same. Some newer versions of DBM only create one file.

Provided the key is known in advance DBM format files are a very efficient way of accessing information associated with that key. For web user authentication, the key will be the username, and the value will store their (encrypted) password. Looking up usernames and their passwords in a DBM file will be more efficient than using a plain text file when more than a few users are involved. This will be particularly important for sites with lots of users (say, over 10,000) or where there are lots of accesses to authenticated pages.

Preparing Apache for DBM Files

If you want to use DBM format files with Apache, you will need to make sure it is compiled with DBM support. By default, Apache cannot use DBM files for user authentication, so the optional DBM authentication module needs to be included. Note that this is included in addition to the normal user authentication module (which uses plain text files, as explained in the previous article). It is possible to have support for multiple file formats compiled into Apache at the same time.

To add the DBM authentication module, edit your Configuration file in the Apache src directory. Remove the comment from the line which currently says

  # Module dbm_auth_module     mod_auth_dbm.o

To remove the comment, delete the # and space character at the start of the line. Now update the Apache configuration by running ./Configure, then re-make the executable with make.

However, before compiling you might also need to tell Apache where to find the DBM functions. On some systems this is automatic. On others you will need to add the text -lndbm or -ldbm to the EXTRA_LIBS line in the Configuration file. (Apache 1.2 will attempt to do this automatically if needed, but you might still need to configure it manually in some cases). If you are not sure what your system requires, try leaving it blank and compiling. If at the end of the compilation you see errors about functions such as _dbm_fetch() not being found, try each of these choices in turn. (Remember to re-run ./Configure after changing Configuration). If you still cannot get it to compile, you might have a system where the DBM library is installed in a non-standard directory, or where there is no DBM library available. You could either contact your system administrator, or download and compile your own copy of the DBM libraries.

Creating A DBM Users File

For standard (htpasswd) user authentication password files, the program htpasswd is used to add new users and set their passwords. To create and manage DBM format user files, another program from the Apache support directory is used. The program is called dbmmanage and is written in perl (so you will need perl on your system, and it will need to have been compiled with support for the same DBM library you compiled into Apache. If you have only just installed DBM on your system you might need to re-compile perl to build in DBM support).

This program can be used to create a new DBM file, add users and passwords to it, change passwords, or delete users. To start by creating a new DBM file and adding a user to it, run the command:

  dbmmanage /usr/local/etc/httpd/usersdbm adduser martin hamster

This creates the DBM file /usr/local/etc/httpd/usersdbm (which might actually consist of /usr/local/etc/httpd/usersdbm.dir and /usr/local/etc/httpd/usersdbm.pag), if it does not already exist. It then adds the user 'martin' with password 'hamster'. This command can be used with other usernames and passwords to add more users, or with an existing username to change that user's password. A user can be deleted from the password file with

   dbmmanage /usr/local/etc/httpd/usersdbm delete martin

You can get a list of all the users in the DBM file with

   dbmmanage /usr/local/etc/httpd/usersdbm view

Restricting a Directory

Now that you have a DBM user authentication file with some users in it, you are ready to create an authenticated area. You can restrict a directory either by using a <Directory> section in access.conf or by using a .htaccess file. The feature on user authentication explained how you can set up a basic .htaccess file, using this example:

  AuthName "restricted stuff"
AuthType Basic
AuthUserFile /usr/local/etc/httpd/users

require valid-user

To use DBM files, the only change is to replace the directive AuthUserFile line with

  AuthDBMUserFile /usr/local/etc/httpd/usersdbm

This single change tells Apache that the user file is now in a DBM format, rather than plain text. All the rest of the user authentication setup remains the same (so the authentication type is still Basic, and the syntax of require is the same as before).
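
Putting that together, the complete .htaccess file for a DBM-protected area looks like this:

  AuthName "restricted stuff"
  AuthType Basic
  AuthDBMUserFile /usr/local/etc/httpd/usersdbm

  require valid-user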

Using Groups

Each user can be in one or more "groups", and you can restrict access to people just in a specified group. This makes it possible to manage all your users on your site in a single database, and customise the areas that each can access. The use of DBM files for storing group information is particularly efficient because you can use the same file to store both password and group information.

The dbmmanage command can be used to set group information for users. For example, to add the user "martin" to the group "staff", you would use

  dbmmanage /usr/local/etc/httpd/usersdbm adduser martin hamster staff

You can put a user into multiple groups by listing them, separated by commas. For example,

  dbmmanage /usr/local/etc/httpd/usersdbm adduser martin hamster staff,admin

Note that dbmmanage has to be told the password as well, and there is no way to set or change group information for a user without knowing their password. This means in practice that dbmmanage is not suitable for managing users in groups, and you will have to write your own management scripts. Some help writing perl to manage DBM files is given later in this article.
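
As a taste of what such a script involves, here is a minimal sketch that changes a user's groups without knowing their password, by re-using the encrypted password already stored in the DBM file (the path, username and groups are examples; the value format it relies on is described below):

  # Re-write a user's group list, keeping the stored encrypted password
  $user = "martin";
  dbmopen(%DBM, "/usr/local/etc/httpd/usersdbm", 0644) ||
      die "Cannot open file: $!\n";
  die "$user not stored\n" unless defined($DBM{$user});
  ($enc) = split(/:/, $DBM{$user});   # first field is the encrypted password
  $DBM{$user} = "$enc:staff,admin";   # replace the group list
  dbmclose(%DBM);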

After creating a user and group file containing details of which users are in which groups, you can restrict access by these groups. For example, to restrict access to an area to only people in the group staff, you could use:

  AuthName "restricted stuff"
AuthType Basic
AuthDBMUserFile /usr/local/etc/httpd/users
AuthDBMGroupFile /usr/local/etc/httpd/users

require group staff

Custom Management of DBM Files

The supplied dbmmanage script to manage DBM files is adequate for basic editing, but cannot handle advanced use, such as managing group information. It is also command line driven, while a Web interface might be a better choice in many situations. To do either of these things you will have to write programs to manage DBM files yourself. Using perl this is not too difficult.

As a simple example, say you have an existing .htpasswd file and you want to convert it to a DBM file, putting all the users in a specific group. We will introduce the concepts here, and there is a link below to the completed program for you to download. It will be written in Perl, which is quick to write and easy to customise, although the principles of DBM use are the same whatever language is used.

The basic way to read a DBM file is shown here. DBM files are opened in Perl as 'hashed arrays'. The "key" is the user name, and the value is the encrypted password plus optional group information. A simple script to look up all the keys and values in a DBM file is:

  # Open the DBM file. Note that the argument is the DBM file itself
  # (without the .dir/.pag extension), not a directory.
  dbmopen(%DBM, "/usr/local/etc/httpd/usersdbm", 0644) ||
      die "Cannot open file: $!\n";
  while (($key, $value) = each %DBM) {
      print "key=$key, value=$value\n";
  }
  dbmclose(%DBM);

Note that if the given DBM file does not exist, it will be created. This script will work with both perl 4 and perl 5 (although Perl 5 users might prefer to use the new tie facility instead of dbmopen). To look up a known key you would use:

  # Look up a single user by name
  $key = "martin";

  dbmopen(%DBM, "/usr/local/etc/httpd/usersdbm", 0644) ||
      die "Cannot open file: $!\n";
  $value = $DBM{$key};
  if (!defined($value)) {
      print "$key not stored\n";
  } else {
      print "key=$key, value=$value\n";
  }
  dbmclose(%DBM);
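
For comparison, the same read-only lookup written with the Perl 5 tie facility mentioned above might look like this (NDBM_File is an assumption here; it must match the DBM library that perl and Apache were built with):

  use Fcntl;
  use NDBM_File;

  $key = "martin";
  tie(%DBM, 'NDBM_File', '/usr/local/etc/httpd/usersdbm', O_RDONLY, 0644) ||
      die "Cannot open file: $!\n";
  print defined($DBM{$key}) ? "key=$key, value=$DBM{$key}\n"
                            : "$key not stored\n";
  untie(%DBM);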

Now we can write a script to convert a htpasswd file into a DBM database, optionally putting each user into one or more groups. The script is htpasswd2dbm.pl, and is used like this:

  cd /usr/local/etc/httpd
  htpasswd2dbm.pl -htpasswd users usersdbm

The -htpasswd option specifies the htpasswd file to be read, and the final argument is the DBM file to create (or add to). To set a group, use the -group argument. For example, to put all the users from this file into the groups admin and staff, use

  htpasswd2dbm.pl -htpasswd users -group admin,staff usersdbm

The program will add users to an existing DBM database, so it can be used to merge multiple htpasswd files. If you give users from different files different groups, you will be able to set up access restrictions on a group-by-group basis, and manage all your users in one database. Note that if there is already a user with the same username in the DBM file it will be overwritten by the new information.
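
The heart of such a conversion script is a short loop like the following sketch (the variable names are illustrative, and are assumed to have been set from the command-line arguments; the value format written here is explained next):

  open(HT, $htpasswd) || die "Cannot open $htpasswd: $!\n";
  dbmopen(%DBM, $dbmfile, 0644) || die "Cannot open $dbmfile: $!\n";
  while (<HT>) {
      chomp;
      next if /^\s*$/;                    # skip blank lines
      ($user, $enc) = split(/:/, $_, 2);  # htpasswd lines are user:password
      $value = $enc;
      $value .= ":" . join(",", @groups) if @groups;
      $DBM{$user} = $value;               # overwrites any existing entry
  }
  close(HT);
  dbmclose(%DBM);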

Group information is stored in a DBM file as part of the value. If no group information is stored, the value associated with a username just consists of the encrypted password. To store group information, the encrypted password is followed by a colon, then a list of the groups that the user is in, separated by commas. So a typical value might look like this:

  E7yT67YGht65:admin,staff

A program written in perl can easily extract the group information, for example:

  $value = $DBM{$key};
  ($enc, $groupfield) = split(/:/, $value);
  @groups = split(/,/, $groupfield);

It is also possible to store additional information in the DBM file, by following the groups list with a colon. Apache will ignore any data after a colon following the groups list, so it could be used, for example, to store the real name and contact details for the user, and an expiry date. This could be stored in the DBM like this:

  $DBM{$key} = join(":", $enc, join(",", @groups),
                    $realname, $company, $emailaddr, $expdate);

Keeping all the user information together in a database like this, which Apache can also use for user authentication, can make administering a site with many users simpler.
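
Reading those extra fields back is just the reverse split (the field order matches the join above):

  ($enc, $grouplist, $realname, $company, $emailaddr, $expdate) =
      split(/:/, $DBM{$key});
  @groups = split(/,/, $grouplist);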

Thursday, May 28, 2009

TCP Wrapper

TCP Wrapper is a host-based networking ACL system, used to filter network access to Internet Protocol servers on (Unix-like) operating systems such as Linux or BSD. It allows host or subnetwork IP addresses, names, and/or ident query replies to be used as tokens on which to filter for access control purposes.


TCP Wrappers Configuration Files

(From redhat.com)

To determine if a client machine is allowed to connect to a service, TCP wrappers reference the following two files, which are commonly referred to as hosts access files:

  • /etc/hosts.allow

  • /etc/hosts.deny

When a client request is received by a TCP wrapped service, it takes the following basic steps:

  1. The service references /etc/hosts.allow. — The TCP wrapped service sequentially parses the /etc/hosts.allow file and applies the first rule specified for that service. If it finds a matching rule, it allows the connection. If not, it moves on to step 2.

  2. The service references /etc/hosts.deny. — The TCP wrapped service sequentially parses the /etc/hosts.deny file. If it finds a matching rule, it denies the connection. If not, access to the service is granted.

The following are important points to consider when using TCP wrappers to protect network services:

  • Because access rules in hosts.allow are applied first, they take precedence over rules specified in hosts.deny. Therefore, if access to a service is allowed in hosts.allow, a rule denying access to that same service in hosts.deny is ignored.

  • Since the rules in each file are read from the top down and the first matching rule for a given service is the only one applied, the order of the rules is extremely important.

  • If no rules for the service are found in either file, or if neither file exists, access to the service is granted.

  • TCP wrapped services do not cache the rules from the hosts access files, so any changes to hosts.allow or hosts.deny take effect immediately without restarting network services.

15.2.1. Formatting Access Rules

The format for both /etc/hosts.allow and /etc/hosts.deny is identical. Any blank lines or lines that start with a hash mark (#) are ignored, and each rule must be on its own line.

Each rule uses the following basic format to control access to network services:

<daemon list> : <client list> [ : <option> : <option> ...]

The following is a basic sample hosts access rule:

vsftpd : .example.com 

This rule instructs TCP wrappers to watch for connections to the FTP daemon (vsftpd) from any host in the example.com domain. If this rule appears in hosts.allow, the connection will be accepted. If this rule appears in hosts.deny, the connection will be rejected.
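
Because hosts.allow is consulted first and a match there ends the search, a matching allow rule always wins. As an illustrative sketch (the subnet is an example), if hosts.allow contains

vsftpd : 192.168.0.

and hosts.deny contains

vsftpd : ALL

then hosts on the 192.168.0.x network can reach the FTP daemon while all other hosts are denied.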

The next sample hosts access rule is more complex and uses two option fields:

sshd : .example.com  \
: spawn /bin/echo `/bin/date` access denied>>/var/log/sshd.log \
: deny

Note that in this example each option field is preceded by the backslash (\). Use of the backslash prevents failure of the rule due to length.

Warning

If the last line of a hosts access file is not followed by a newline character (created by pressing the [Enter] key), the last rule in the file will fail and an error will be logged to either /var/log/messages or /var/log/secure. This is also the case for rules that span multiple lines without using the backslash. The following example illustrates the relevant portion of a log message for a rule failure due to either of these circumstances:

warning: /etc/hosts.allow, line 20: missing newline or line too long

This sample rule states that if a connection to the SSH daemon (sshd) is attempted from a host in the example.com domain, execute the echo command (which will log the attempt to a special file), and deny the connection. Because the optional deny directive is used, this line will deny access even if it appears in the hosts.allow file. For a more detailed look at available options, see Section 15.2.3 Option Fields.

15.2.1.1. Wildcards

Wildcards allow TCP wrappers to more easily match groups of daemons or hosts. They are used most frequently in the client list field of access rules.

The following wildcards may be used:

  • ALL — Matches everything. It can be used for both the daemon list and the client list.

  • LOCAL — Matches any host that does not contain a period (.), such as localhost.

  • KNOWN — Matches any host where the hostname and host address are known or where the user is known.

  • UNKNOWN — Matches any host where the hostname or host address are unknown or where the user is unknown.

  • PARANOID — Matches any host where the hostname does not match the host address.
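
For example, the following hosts.allow rule (a sketch) combines two of these wildcards to let any host without a period in its name, such as localhost, use any service:

ALL : LOCAL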

Caution

The KNOWN, UNKNOWN, and PARANOID wildcards should be used with care as a disruption in name resolution may prevent legitimate users from gaining access to a service.

15.2.1.2. Patterns

Patterns can be used in the client list field of access rules to more precisely specify groups of client hosts.

The following is a list of the most common accepted patterns for a client list entry:

  • Hostname beginning with a period (.) — Placing a period at the beginning of a hostname matches all hosts sharing the listed components of the name. The following example would apply to any host within the example.com domain:

    ALL : .example.com
  • IP address ending with a period (.) — Placing a period at the end of an IP address matches all hosts sharing the initial numeric groups of an IP address. The following example would apply to any host within the 192.168.x.x network:

    ALL : 192.168.
  • IP address/netmask pair — Netmask expressions can also be used as a pattern to control access to a particular group of IP addresses. The following example would apply to any host with an address of 192.168.0.0 through 192.168.1.255:

    ALL : 192.168.0.0/255.255.254.0
  • The asterisk (*) — Asterisks can be used to match entire groups of hostnames or IP addresses, as long as they are not mixed in a client list containing other types of patterns. The following example would apply to any host within the example.com domain:

    ALL : *.example.com
  • The slash (/) — If a client list begins with a slash, it is treated as a file name. This is useful if rules specifying large numbers of hosts are necessary. The following example refers TCP wrappers to the /etc/telnet.hosts file for all Telnet connections:

    in.telnetd : /etc/telnet.hosts

Other, lesser used patterns are also accepted by TCP wrappers. See the hosts_access man 5 page for more information.

Warning

Be very careful when creating rules requiring name resolution, such as hostnames and domain names. Attackers can use a variety of tricks to circumvent accurate name resolution. In addition, any disruption in DNS service would prevent even authorized users from using network services.

It is best to use IP addresses whenever possible.

15.2.1.3. Operators

At present, access control rules accept one operator, EXCEPT. It can be used in both the daemon list and the client list of a rule.

The EXCEPT operator allows specific exceptions to broader matches within the same rule.

In the following example from a hosts.allow file, all example.com hosts are allowed to connect to all services except cracker.example.com:

ALL: .example.com EXCEPT cracker.example.com

In another example from a hosts.allow file, clients from the 192.168.0.x network can use all services except FTP:

ALL EXCEPT vsftpd: 192.168.0.

Note

Organizationally, it is often easier to use EXCEPT operators sparingly, placing the exceptions to a rule in the other access control file. This allows other administrators to quickly scan the appropriate files to see which hosts are allowed or denied access to services, without having to sort through the various EXCEPT operators.

15.2.2. Portmap and TCP Wrappers

When creating access control rules for portmap, do not use hostnames, as its implementation of TCP wrappers does not support host look-ups. For this reason, only use IP addresses or the keyword ALL when specifying hosts in hosts.allow or hosts.deny.

In addition, changes to portmap access control rules may not take effect immediately.

Widely used services, such as NIS and NFS, depend on portmap to operate, so be aware of these limitations.
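
For example, a hosts.allow rule permitting the 192.168.0.x network to reach portmap would be written with the numeric pattern only (the subnet is illustrative):

portmap : 192.168.0.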

15.2.3. Option Fields

In addition to basic rules allowing and denying access, the Red Hat Linux implementation of TCP wrappers supports extensions to the access control language through option fields. By using option fields within hosts access rules, administrators can accomplish a variety of tasks such as altering log behavior, consolidating access control, and launching shell commands.

15.2.3.1. Logging

Option fields let administrators easily change the log facility and priority level for a rule by using the severity directive.

In the following example, connections to the SSH daemon from any host in the example.com domain are logged to the default authpriv facility (because no facility value is specified) with a priority of emerg:

sshd : .example.com : severity emerg

It is also possible to specify a facility using the severity option. The following example logs any SSH connection attempts by hosts from the example.com domain to the local0 facility with a priority of alert:

sshd : .example.com : severity local0.alert

Note

In practice, this example will not work until the syslog daemon (syslogd) is configured to log to the local0 facility. See the syslog.conf man page for information about configuring custom log facilities.
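
A minimal syslog.conf entry for that facility might look like this (the log file path is an illustrative choice):

local0.*                                /var/log/local0.log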

15.2.3.2. Access Control

Option fields also allow administrators to explicitly allow or deny hosts in a single rule by adding the allow or deny directive as the final option.

For instance, the following two rules allow SSH connections from client-1.example.com, but deny connections from client-2.example.com:

sshd : client-1.example.com : allow
sshd : client-2.example.com : deny

By allowing access control on a per-rule basis, the option field allows administrators to consolidate all access rules into a single file: either hosts.allow or hosts.deny. Some consider this an easier way of organizing access rules.

15.2.3.3. Shell Commands

Option fields allow access rules to launch shell commands through the following two directives:

  • spawn — Launches a shell command as a child process. This option directive can perform tasks like using /usr/sbin/safe_finger to get more information about the requesting client or creating special log files using the echo command.

    In the following example, clients attempting to access Telnet services from the example.com domain are quietly logged to a special file:

    in.telnetd : .example.com \
    : spawn /bin/echo `/bin/date` from %h>>/var/log/telnet.log \
    : allow
  • twist — Replaces the requested service with the specified command. This directive is often used to set up traps for intruders (also called "honey pots"). It can also be used to send messages to connecting clients. The twist command must occur at the end of the rule line.

    In the following example, clients attempting to access FTP services from the example.com domain are sent a message via the echo command:

    vsftpd : .example.com \
    : twist /bin/echo "421 Bad hacker, go away!"

For more information about shell command options, see the hosts_options man page.

15.2.3.4. Expansions

Expansions, when used in conjunction with the spawn and twist directives, provide information about the client, server, and processes involved.

Below is a list of supported expansions:

  • %a — The client's IP address.

  • %A — The server's IP address.

  • %c — Supplies a variety of client information, such as the username and hostname, or the username and IP address.

  • %d — The daemon process name.

  • %h — The client's hostname (or IP address, if the hostname is unavailable).

  • %H — The server's hostname (or IP address, if the hostname is unavailable).

  • %n — The client's hostname. If unavailable, unknown is printed. If the client's hostname and host address do not match, paranoid is printed.

  • %N — The server's hostname. If unavailable, unknown is printed. If the server's hostname and host address do not match, paranoid is printed.

  • %p — The daemon process ID.

  • %s — Various types of server information, such as the daemon process and the host or IP address of the server.

  • %u — The client's username. If unavailable, unknown is printed.

The following sample rule uses an expansion in conjunction with the spawn command to identify the client host in a customized log file.

It instructs TCP wrappers that if a connection to the SSH daemon (sshd) is attempted from a host in the example.com domain, execute the echo command to log the attempt, including the client hostname (using the %h expansion), to a special file:

sshd : .example.com  \
: spawn /bin/echo `/bin/date` access denied to %h>>/var/log/sshd.log \
: deny

Similarly, expansions can be used to personalize messages back to the client. In the following example, clients attempting to access FTP services from the example.com domain are informed that they have been banned from the server:

vsftpd : .example.com \
: twist /bin/echo "421 %h has been banned from this server!"

For a full explanation of available expansions, as well as additional access control options, see section 5 of the man pages for hosts_access (man 5 hosts_access) and the man page for hosts_options.

PAM (Pluggable authentication module)

Note: this document is written in reference to Red Hat Linux 6.2+

PAM (Pluggable authentication module) is very diverse in the types of modules it provides, and one can accomplish many authentication tasks using it. However, PAM extends beyond typical authentication programs, as it allows an admin to employ other system-critical features such as resource limiting, su protection, and TTY restrictions. Many of PAM's features are not within the scope of this document; for further reading you can refer to the links at the bottom of this document.

First we must enable the pam_limits module inside /etc/pam.d/login. Add the following line to the end of the file:

session required /lib/security/pam_limits.so

After adding the line above, the /etc/pam.d/login file should look something like this:

#%PAM-1.0
auth required /lib/security/pam_securetty.so
auth required /lib/security/pam_stack.so service=system-auth
auth required /lib/security/pam_nologin.so
account required /lib/security/pam_stack.so service=system-auth
password required /lib/security/pam_stack.so service=system-auth
session required /lib/security/pam_stack.so service=system-auth
session optional /lib/security/pam_console.so
session required /lib/security/pam_limits.so

The limits.conf file located under the /etc/security directory can be used to control and set resource policies. limits.conf is well commented and easy to use, so do take the time to skim over its contents. It is important to set resource limits on all your users so they can't perform denial of service attacks with such things as fork bombs; amongst other things, it can also stop 'stray' server processes from taking the system down.

It is also a good idea to separate the rules for users, admins, and other (other being everything else). This is important because, for instance, if a user fork bombs the system, it could in effect disable an administrator's ability to log in to the system and take proper action, or worse, crash the server.

Below is the default policy used on a server I've configured:

# For everyone (users and other)
* hard core 0
* - maxlogins 12
* hard nproc 50
* hard rss 20000

# For group wheel (admins)
@wheel - maxlogins 5
@wheel hard nproc 80
@wheel hard rss 75000

#End of file

The first set of rules says to prohibit the creation of core files (core 0), restrict the number of processes to 50 (nproc 50), restrict logins to 12 (maxlogins 12), and restrict memory usage to 20MB (rss 20000) for everyone except the super user. The later rules, for admins, say to restrict logins to 5 (maxlogins 5), restrict the number of processes to 80 (nproc 80), and restrict memory usage to 75MB (rss 75000).

All the above only concerns users who have entered via the login prompt on your system. The asterisk (*) applies to all users, and @wheel applies only to users in the group wheel. Make sure to add your administrative users to the wheel group (this can be done in /etc/group).
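
For example, the wheel line in /etc/group might end up looking like this (the GID and the username admin1 are illustrative):

wheel:x:10:root,admin1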

Finally edit the /etc/profile file and change the following line:

ulimit -c 1000000

to read:

ulimit -S -c 1000000 > /dev/null 2>&1

This modification is used to avoid getting error messages like 'Unable to reach limit' during login. On newer editions of Red Hat Linux, the latter ulimit setting is the default.

Further reading is available in The Linux-PAM System Administrators' Guide located at:
http://www.kernel.org/pub/linux/libs/pam/L...M-html/pam.html