From Fedora Project Wiki


Revision as of 00:45, 8 December 2012 by Pavlix (talk | contribs) (→‎Tests)

Flag AI_ADDRCONFIG considered harmful

As far as I know, AI_ADDRCONFIG was added for the following reasons:

  • Some buggy DNS servers would be confused by AAAA requests
  • Optimization of the number of DNS queries

Currently, I'm aware of several documents that define AI_ADDRCONFIG:

  • POSIX.1-2008: useless but harmless
  • RFC 3493 (informational): useless but (partially) breaks IPv4/IPv6 localhost
  • RFC 2553 (obsolete informational): useless but hopefully harmless
  • GLIBC getaddrinfo(3): like RFC 3493

Actual GLIBC getaddrinfo() behavior differs from the manual page.

Problem statement

Currently, any of the definitions above prevents AI_ADDRCONFIG from filtering out IPv6 addresses when a link-local IPv6 address is present. These addresses are automatically added to interfaces that are otherwise only configured for IPv4. Therefore, on a typical Linux system, AI_ADDRCONFIG cannot meet its goals and is effectively useless.

It also builds on a false assumption: that no communication over a given protocol is feasible without a non-loopback address of that protocol. But why would we have a loopback address if we can't use it for node-local communication? AI_ADDRCONFIG breaks localhost, localhost4, localhost6, 127.0.0.1, ::1 and more if there's no non-loopback address of the respective protocol.

This can happen when the computer is connected to an IPv4-only or an IPv6-only network, when it loses IPv4 or IPv6 connectivity, and when it's used offline.
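
The rule behind both problems can be sketched in a few lines of Python. This is only a model of the RFC 3493 wording as summarized above, not glibc's code; the function name configured_families and the sample addresses (documentation ranges) are made up for illustration.

```python
import ipaddress

def configured_families(local_addresses):
    """Return the families a non-loopback rule would treat as configured.

    Hypothetical model: a family counts as configured when the host has
    at least one non-loopback address of that family.
    """
    families = set()
    for text in local_addresses:
        addr = ipaddress.ip_address(text)
        if addr.is_loopback:
            continue  # loopback addresses never count
        families.add("IPv4" if addr.version == 4 else "IPv6")
    return families

# IPv4-only network: a link-local IPv6 address is still present,
# so IPv6 counts as configured and nothing gets filtered.
ipv4_only_host = ["127.0.0.1", "::1", "192.0.2.10", "fe80::1"]
print(sorted(configured_families(ipv4_only_host)))

# Offline host: only loopback addresses remain, so no family counts
# as configured and even 127.0.0.1 and ::1 would be filtered out.
offline_host = ["127.0.0.1", "::1"]
print(sorted(configured_families(offline_host)))
```

The first case shows why the flag is useless on an IPv4-only network, the second why it is harmful on an offline machine.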

Tests

Tested with glibc 2.16.0.

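#!/usr/bin/python3
# Resolve a set of names and address literals with AI_ADDRCONFIG set and
# print the addresses returned (or the resolver error) for each of them.
from socket import (AF_UNSPEC, AI_ADDRCONFIG, SOCK_STREAM, SOL_TCP,
                    gaierror, getaddrinfo)

hosts = [
    None,
    "localhost",
    "127.0.0.1",
    "localhost4",
    "::1",
    "localhost6",
    "195.47.235.3",
    "2a02:38::1001",
    "info.nix.cz",
    "www.google.com",
]
for host in hosts:
    print("getaddrinfo host=\"{}\" hints.ai_flags=AI_ADDRCONFIG:".format(host))
    try:
        for item in getaddrinfo(host, "http", AF_UNSPEC, SOCK_STREAM, SOL_TCP, AI_ADDRCONFIG):
            # item[4] is the socket address tuple; its first element is the IP address
            print("  {}".format(item[4][0]))
    except gaierror as error:
        print("  !! {} !!".format(error))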

The desired result may not be well defined in this case. For now I'm using a simple definition that says:

1) Don't break non-DNS results. You never know when you need them.

2) Filter DNS results based on the presence of global IPv4 and global IPv6 addresses (with a simplified definition of global that means not node-local and not link-local).

Feel free to offer better definitions of what constitutes a desired result.
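
The two rules above could be expressed roughly as follows. This is a sketch of the simple definition only, not of any existing implementation; the names is_global and desired_filter, and the from_dns parameter, are hypothetical.

```python
import ipaddress

def is_global(addr):
    """Simplified 'global': neither node-local (loopback) nor link-local."""
    return not (addr.is_loopback or addr.is_link_local)

def desired_filter(results, local_addresses, from_dns):
    """Filter resolved addresses according to the two rules.

    results         -- address strings returned by the resolver
    local_addresses -- address strings configured on the host
    from_dns        -- True for DNS answers, False for literals,
                       /etc/hosts entries and other non-DNS sources
    """
    if not from_dns:
        return list(results)  # rule 1: never touch non-DNS results
    local = [ipaddress.ip_address(a) for a in local_addresses]
    usable = {a.version for a in local if is_global(a)}
    # rule 2: keep only families for which a global address is configured
    return [r for r in results if ipaddress.ip_address(r).version in usable]

host = ["127.0.0.1", "::1", "192.0.2.10", "fe80::1"]
print(desired_filter(["198.51.100.7", "2001:db8::7"], host, from_dns=True))
print(desired_filter(["127.0.0.1", "::1"], host, from_dns=False))
```

On this IPv4-only host the DNS answer is reduced to its IPv4 address, while the non-DNS localhost result is returned untouched.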

The documented result is what follows from the manual page. Note that the definition in getaddrinfo(3) is roughly the same as RFC 3493 but substantially different from POSIX.1-2008.

Host with only 127.0.0.1 and ::1 addresses

Desired result: All addresses and all non-DNS names should work.

Documented result: Nothing should work.

Actual result: Same as desired result, different from documented result.

Host with 127.0.0.1, ::1 and at least one link-local IPv6 address

Desired result: All addresses and all non-DNS names should work.

Documented result: Only IPv6 addresses should work. Non-DNS names should only give IPv6 addresses.

Actual result: Same as documented result, different from desired result.

Host with global IPv4 (and DNS)

Desired result: All addresses and all non-DNS names should work. DNS names should only give IPv4 addresses.

Documented result: Only IPv4 addresses should work. Both non-DNS and DNS names should only give IPv4 addresses.

Actual result: Different from both desired and documented results. All addresses work. All addresses for both non-DNS and DNS names are returned.

Host with global IPv6 (and DNS)

Desired result: All addresses and all non-DNS names should work. DNS names should only give IPv6 addresses.

Documented result: Only IPv6 addresses should work. Both non-DNS and DNS names should only give IPv6 addresses.

Actual result: Same as documented result, different from desired result.

Host with both IPv4 and IPv6 addresses (and DNS, of course)

Desired and documented result: Unlimited address resolution (like without AI_ADDRCONFIG).

Actual result: Same as desired and documented. Everything works.
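
To check which of the scenarios above a given machine falls into, it may help to resolve the same host with and without the flag and compare. The helper names resolve and compare below are made up for this sketch; it uses port 80 instead of the "http" service name so that it does not depend on the services database.

```python
import socket

def resolve(host, flags=0):
    """Return the sorted addresses for host, or the resolver error text."""
    try:
        infos = socket.getaddrinfo(host, 80, socket.AF_UNSPEC,
                                   socket.SOCK_STREAM, socket.IPPROTO_TCP,
                                   flags)
    except socket.gaierror as error:
        return "error: {}".format(error)
    # info[4] is the socket address tuple; its first element is the address
    return sorted({info[4][0] for info in infos})

def compare(host):
    """Resolve host once without and once with AI_ADDRCONFIG."""
    return {
        "plain": resolve(host),
        "addrconfig": resolve(host, socket.AI_ADDRCONFIG),
    }

for host in ("127.0.0.1", "::1", "localhost"):
    print(host, compare(host))
```

Any difference between the two entries for a non-DNS name or an address literal is a case where AI_ADDRCONFIG breaks a non-DNS result.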

Making AI_ADDRCONFIG useful

A possible solution for the first problem (that AI_ADDRCONFIG is useless) is to treat link-local addresses the same as loopback (or node-local) addresses. But this is even more harmful.

Fedora's GLIBC was patched to do exactly that. The consequence was that even link-local IPv6 stopped working when a global IPv6 address was absent. And what would we have link-local addresses for if they didn't work without global addresses? This patch has already been reverted.
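
The effect of that approach can be modelled as below. This is an illustration of the behavior as described here, not the actual Fedora patch; the function names and the sample addresses are hypothetical.

```python
import ipaddress

def families_ignoring_link_local(local_addresses):
    """IP versions counted as configured when link-local addresses are
    ignored in the same way as loopback addresses."""
    families = set()
    for text in local_addresses:
        addr = ipaddress.ip_address(text)
        if addr.is_loopback or addr.is_link_local:
            continue
        families.add(addr.version)
    return families

def literal_survives(literal, local_addresses):
    """Would an address literal pass a filter based on those families?"""
    wanted = ipaddress.ip_address(literal).version
    return wanted in families_ignoring_link_local(local_addresses)

host = ["127.0.0.1", "::1", "192.0.2.10", "fe80::1"]
print(literal_survives("fe80::2", host))    # link-local neighbour: False
print(literal_survives("192.0.2.1", host))  # IPv4 neighbour: True
```

The host has a perfectly usable link-local IPv6 address, yet the link-local neighbour is filtered out because no global IPv6 address is configured.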

Conclusion

The whole idea of filtering out non-DNS addresses is flawed and breaks many things, including IPv4 and IPv6 literals. There is no reason to filter them out.

Proposed solutions:

1) Make getaddrinfo() ignore AI_ADDRCONFIG. It has not been working for years and nobody cared enough to fix it, so there is a substantial probability that it's not needed. Remove the code that implements it (patch).

1b) Make getaddrinfo() ignore AI_ADDRCONFIG only when filtering the results, but keep its behavior for gethostbyname* function selection, which affects DNS results. The resulting behavior is something between #1 and #3.

2) Patch all software to avoid using AI_ADDRCONFIG. Follow new development, and prevent/reject modifications that add it. This is impractical.

3) Only process AI_ADDRCONFIG in the nsswitch DNS plugin. This requires implementing getaddrinfo() in nsswitch which is required for zeroconf networking anyway. Use solution (1) as a temporary fix. Locally assigned addresses looked up through local DNS would still fail.

Notes: Solution #2 is advocated by Michal Kubeček from SUSE. The third solution is the outcome of long discussions between me (Pavel Šimerda) and Tore Anderson, who explained to me the original purpose of AI_ADDRCONFIG. I would have no problem with just doing #1.
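
For a single application, solution #2 amounts to never passing the flag. A minimal sketch in Python, assuming a wrapper of our own (strip_addrconfig and getaddrinfo_without_addrconfig are hypothetical names, not part of any library):

```python
import socket

def strip_addrconfig(flags):
    """Return flags with the AI_ADDRCONFIG bit cleared."""
    return flags & ~socket.AI_ADDRCONFIG

def getaddrinfo_without_addrconfig(host, port, family=0, type=0, proto=0,
                                   flags=0):
    """Call socket.getaddrinfo() but never pass AI_ADDRCONFIG on."""
    return socket.getaddrinfo(host, port, family, type, proto,
                              strip_addrconfig(flags))

# Other flags are kept, only AI_ADDRCONFIG is dropped.
requested = socket.AI_ADDRCONFIG | socket.AI_PASSIVE
print(strip_addrconfig(requested) == socket.AI_PASSIVE)
```

Doing this in every program is exactly what makes #2 impractical; solution #1 gets the same effect once, inside getaddrinfo() itself.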

More resources:

Examples of software using AI_ADDRCONFIG