From Fedora Project Wiki

Revision as of 01:12, 8 December 2012

Flag AI_ADDRCONFIG considered harmful

As far as I know, AI_ADDRCONFIG was added for the following reasons:

  • Some buggy DNS servers would be confused by AAAA requests
  • Optimization of the number of DNS queries

Currently, I'm aware of several documents that define AI_ADDRCONFIG:

  • POSIX.1-2008: useless but harmless
  • RFC 3493 (informational): useless but (partially) breaks IPv4/IPv6 localhost
  • RFC 2553 (obsolete informational): useless but hopefully harmless
  • GLIBC getaddrinfo(3): like RFC 3493

Actual GLIBC getaddrinfo() behavior differs from the manual page.

Problem statement

Currently, every one of the definitions above prevents AI_ADDRCONFIG from filtering out IPv6 addresses when a link-local IPv6 address is present. Such addresses are automatically added to interfaces that are otherwise configured only for IPv4. Therefore, on a typical Linux system, AI_ADDRCONFIG cannot meet its goals and is effectively useless.

It also builds on a false assumption: that no IPv4 communication is feasible without a non-loopback IPv4 address (and likewise for IPv6). But why would we have a loopback address if we couldn't use it for node-local communication? AI_ADDRCONFIG breaks localhost, localhost4, localhost6, 127.0.0.1, ::1 and more if there's no non-loopback address of the respective protocol.

This can happen when the computer is connected to an IPv4-only or an IPv6-only network, when it loses IPv4 or IPv6 connectivity, or when it's used offline.

The actual getaddrinfo() implementation differs from all available specifications including its own manual page.
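The breakage described above can be checked on a given machine with a small sketch like the following. The helper names `resolve` and `broken_by_addrconfig` are made up for this illustration, and the result depends entirely on which addresses the host has configured and on the libc in use.

```python
import socket


def resolve(host, flags=0):
    """Return the sorted addresses getaddrinfo() yields, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_UNSPEC,
                                   socket.SOCK_STREAM, 0, flags)
    except socket.gaierror:
        return []
    # info[4] is the sockaddr tuple; its first element is the address string.
    return sorted({info[4][0] for info in infos})


def broken_by_addrconfig(hosts):
    """List hosts that resolve without AI_ADDRCONFIG but not with it."""
    return [h for h in hosts
            if resolve(h) and not resolve(h, socket.AI_ADDRCONFIG)]


if __name__ == "__main__":
    # Literals only, so the check does not depend on DNS.
    print(broken_by_addrconfig(["127.0.0.1", "::1"]))
```

An empty list means that neither literal is affected on this host. A non-empty list names the literals that only fail because of the flag.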

Tests

Tested with glibc 2.16.0.

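#!/usr/bin/python3
# Resolve a fixed set of hosts with AI_ADDRCONFIG and print what comes back.
from socket import (AF_UNSPEC, AI_ADDRCONFIG, SOCK_STREAM, SOL_TCP,
                    gaierror, getaddrinfo)

hosts = [
    None,               # no node name
    "localhost",        # non-DNS name
    "127.0.0.1",        # IPv4 loopback literal
    "localhost4",       # non-DNS name
    "::1",              # IPv6 loopback literal
    "localhost6",       # non-DNS name
    "195.47.235.3",     # IPv4 literal
    "2a02:38::1001",    # IPv6 literal
    "info.nix.cz",      # DNS name
    "www.google.com",   # DNS name
]
for host in hosts:
    print('getaddrinfo host="{}" hints.ai_flags=AI_ADDRCONFIG:'.format(host))
    try:
        for item in getaddrinfo(host, "http", AF_UNSPEC, SOCK_STREAM, SOL_TCP, AI_ADDRCONFIG):
            # item[4] is the socket address; its first element is the IP address.
            print("  {}".format(item[4][0]))
    except gaierror as error:
        print("  !! {} !!".format(error))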

The desired result may not be well defined in this case. For now I'm using a simple definition that says:

1) Don't break non-DNS results. You never know when you need them.

2) Filter DNS results based on the presence of global IPv4 and global IPv6 addresses (with a simplified definition of global that means not node-local and not link-local).

Feel free to offer better definitions of what constitutes a desired result.
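The two rules above can be written down as a small pure function. This is only a sketch of the simplified definition. The names `is_global` and `desired_filter` are invented here, and the addresses in the usage note come from the documentation ranges.

```python
import ipaddress


def is_global(addr):
    """Simplified 'global': neither node-local (loopback) nor link-local."""
    ip = ipaddress.ip_address(addr)
    return not (ip.is_loopback or ip.is_link_local)


def desired_filter(results, local_addrs, from_dns):
    """Apply the two rules to candidate result addresses.

    results:     candidate addresses for the requested name
    local_addrs: addresses configured on the host
    from_dns:    True if the candidates were obtained from DNS
    """
    if not from_dns:
        # Rule 1: never break non-DNS results.
        return list(results)
    # Rule 2: keep a family only if the host has a global address of it.
    families = {ipaddress.ip_address(a).version
                for a in local_addrs if is_global(a)}
    return [r for r in results if ipaddress.ip_address(r).version in families]
```

For a host with `["127.0.0.1", "::1", "192.0.2.10", "fe80::1"]`, DNS results keep only their IPv4 entries, while `127.0.0.1` and `::1` obtained from a non-DNS source pass through untouched.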

The documented result is what follows from the manual page. Note that the manual page's definition is roughly the same as that of RFC 3493 but substantially different from that of POSIX.1-2008.

Host with only 127.0.0.1 and ::1 names

Desired result: All addresses and all non-DNS names should work.

Documented result: Nothing should work.

Actual result: Same as desired result, different from documented result.

Broken addresses: None (127.0.0.1, ::1 according to documentation).

Host with 127.0.0.1, ::1 and at least one link-local IPv6 address

Desired result: All addresses and all non-DNS names should work.

Documented result: Only IPv6 addresses should work. Non-DNS names should only give IPv6 addresses.

Actual result: Same as documented result, different from desired result.

Broken addresses: 127.0.0.1

Host with global IPv4, link-local IPv6 (and DNS)

Desired result: All addresses and all non-DNS names should work. DNS names should only give IPv4 addresses.

Documented result: Unlimited address resolution (like without AI_ADDRCONFIG).

Actual result: Same as documented, different from desired.

Host with global IPv4 (and DNS), without link-local IPv6 (like non-ethernet links)

Desired result: All addresses and all non-DNS names should work. DNS names should only give IPv4 addresses.

Documented result: Only IPv4 addresses should work. Both non-DNS and DNS names should only give IPv4 addresses.

Actual result: Same as documented, different from desired.

Broken addresses: ::1

Host with global IPv6 (and DNS)

Desired result: All addresses and all non-DNS names should work. DNS names should only give IPv6 addresses.

Documented result: Only IPv6 addresses should work. Both non-DNS and DNS names should only give IPv6 addresses.

Actual result: Same as documented result, different from desired result.

Broken addresses: 127.0.0.1

Host with both IPv4 and IPv6 addresses (and DNS, of course)

Desired and documented result: Unlimited address resolution (like without AI_ADDRCONFIG).

Actual result: Same as desired and documented. Everything works.
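The "Documented result" lines in the scenarios above all follow from one rule in the manual page: an address family is returned only if the host has at least one address of that family configured, not counting loopback. The sketch below models that rule. The function name, the scenario labels and the documentation-range addresses are placeholders chosen for this example, not part of the tests above.

```python
import ipaddress


def documented_families(local_addrs):
    """IP versions AI_ADDRCONFIG lets through under the manual-page rule."""
    return {ipaddress.ip_address(a).version
            for a in local_addrs
            if not ipaddress.ip_address(a).is_loopback}


LOOPBACK = ["127.0.0.1", "::1"]
SCENARIOS = {
    "loopback only": LOOPBACK,
    "loopback + link-local IPv6": LOOPBACK + ["fe80::1"],
    "global IPv4 + link-local IPv6": LOOPBACK + ["192.0.2.10", "fe80::1"],
    "global IPv4, no link-local IPv6": LOOPBACK + ["192.0.2.10"],
    "global IPv6": LOOPBACK + ["2001:db8::10", "fe80::1"],
    "global IPv4 and IPv6": LOOPBACK + ["192.0.2.10", "2001:db8::10", "fe80::1"],
}

for name, addrs in SCENARIOS.items():
    # An empty list means that nothing is returned at all.
    print("{}: {}".format(name, sorted(documented_families(addrs))))
```

Note how the link-local address alone is enough to let IPv6 through in the third scenario, which is why the flag filters nothing there.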

Making AI_ADDRCONFIG useful

A possible solution for the first problem (that AI_ADDRCONFIG is useless) is to treat link-local addresses the same as loopback (or node-local) addresses. But this is even more harmful.

Fedora's GLIBC was patched to do exactly that. The consequence was that even link-local IPv6 stopped working when a global IPv6 address was absent. And what would we have link-local addresses for if they didn't work without global addresses? This patch has already been reverted.
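The effect of ignoring link-local addresses can be illustrated with a toy model. It is not the code of the Fedora patch, only a sketch of the rule it implemented. The function and its `ignore_link_local` parameter are hypothetical.

```python
import ipaddress


def configured_families(local_addrs, ignore_link_local=False):
    """IP versions considered 'configured' for AI_ADDRCONFIG purposes."""
    families = set()
    for addr in local_addrs:
        ip = ipaddress.ip_address(addr)
        if ip.is_loopback:
            continue  # loopback never counts
        if ignore_link_local and ip.is_link_local:
            continue  # the patched rule: link-local does not count either
        families.add(ip.version)
    return families


# Typical IPv4-only network: global IPv4 plus an automatic link-local IPv6.
addrs = ["127.0.0.1", "::1", "192.0.2.10", "fe80::1"]
print(sorted(configured_families(addrs)))
print(sorted(configured_families(addrs, ignore_link_local=True)))
```

With the patched rule IPv6 drops out of the set on such a host, so every IPv6 result is filtered, including link-local addresses that are perfectly reachable.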

Conclusion

The whole idea of filtering out non-DNS addresses is flawed and breaks many things, including IPv4 and IPv6 literals. There is no reason to filter them out.

Proposed solutions:

1) Make getaddrinfo() ignore AI_ADDRCONFIG. It has not been working for years and nobody cared enough to fix it, so there is a substantial probability that it's not needed. Remove the code that implements it (patch).

1b) Make getaddrinfo() ignore AI_ADDRCONFIG only when filtering the results but keep its behavior for gethostbyname* function selection, which affects DNS results. The resulting behavior is something between #1 and #3.

2) Patch all software to avoid using AI_ADDRCONFIG. Follow new development, and prevent/reject modifications that add it. This is impractical.

3) Only process AI_ADDRCONFIG in the nsswitch DNS plugin. This requires implementing getaddrinfo() in nsswitch which is required for zeroconf networking anyway. Use solution (1) as a temporary fix. Locally assigned addresses looked up through local DNS would still fail.

Notes: Solution #2 is advocated by Michal Kubeček from SUSE. The third solution is the outcome of long discussions between me (Pavel Šimerda) and Tore Anderson, who explained to me the original purpose of AI_ADDRCONFIG. I would have no problem with just doing #1.
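Until one of the library-side solutions lands, an application can approximate a narrow form of solution #2 by never passing AI_ADDRCONFIG for address literals. The wrapper below is only a sketch: `is_literal`, `effective_flags` and `getaddrinfo_safe` are hypothetical names, and non-DNS names such as localhost are not covered because the application cannot tell them apart from DNS names.

```python
import ipaddress
import socket


def is_literal(host):
    """True for None (no node name) and for IPv4/IPv6 address literals."""
    if host is None:
        return True
    try:
        # Strip an optional scope ID such as "%eth0" before parsing.
        ipaddress.ip_address(host.split("%", 1)[0])
    except ValueError:
        return False
    return True


def effective_flags(host, flags):
    """Drop AI_ADDRCONFIG when the host is a literal or None."""
    if is_literal(host):
        return flags & ~socket.AI_ADDRCONFIG
    return flags


def getaddrinfo_safe(host, port, family=0, type=0, proto=0, flags=0):
    """Like socket.getaddrinfo(), but never filters literals."""
    return socket.getaddrinfo(host, port, family, type, proto,
                              effective_flags(host, flags))
```

Calling `getaddrinfo_safe("127.0.0.1", 80, flags=socket.AI_ADDRCONFIG)` then behaves as if the flag had not been given, while real host names are still resolved with the flag set.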

More resources:

Examples of software using AI_ADDRCONFIG