Revision as of 15:16, 23 November 2012
Name resolution
Resolving using getaddrinfo() in applications
The getaddrinfo() function is a dualstack-friendly API to name resolution. It is used by applications to translate host and service names to a linked list of struct addrinfo objects. It has its own manual page getaddrinfo(3) in the Linux Programmer's Manual.
Running getaddrinfo()
An example of a getaddrinfo() call:
const char *node = "www.fedoraproject.org";
const char *service = "http";
struct addrinfo hints = {
    .ai_family = AF_UNSPEC,
    .ai_socktype = SOCK_DGRAM,
    .ai_flags = 0,
    .ai_protocol = 0,
    .ai_canonname = NULL,
    .ai_addr = NULL,
    .ai_next = NULL
};
struct addrinfo *result;
int error;
error = getaddrinfo(node, service, &hints, &result);
The input of getaddrinfo() consists of node specification, service specification and further hints.
- node: literal IPv4 or IPv6 address, or a hostname to be resolved
- service: numeric port number or a symbolic service name
- hints.ai_family: enable dualprotocol, IPv4-only or IPv6-only queries
- hints.ai_socktype: select socket type (and thus the transport protocol)
getaddrinfo() can be further tweaked with hints.ai_flags. Other attributes are either not needed (ai_protocol) or not supposed to be set in hints (ai_canonname, ai_addr and ai_next).
On success, the error variable is set to 0 and result points to a linked list of one or more struct addrinfo objects.
Never assume that getaddrinfo() returns only one result or that the first result actually works!
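To make the warning above concrete, here is a minimal Python sketch that lists every entry returned for a name instead of taking only the first one. The helper name list_results is hypothetical and not part of any library; the default socket type is an assumption chosen only to keep the output short.

```python
import socket

def list_results(host, service, socktype=socket.SOCK_STREAM):
    """Return (address family name, sockaddr) for every getaddrinfo() result."""
    results = socket.getaddrinfo(host, service, socket.AF_UNSPEC, socktype)
    return [
        (socket.AddressFamily(family).name, sockaddr)
        for family, _socktype, _proto, _canonname, sockaddr in results
    ]
```

Calling list_results("localhost", 80) on a dualstack machine will typically show both an AF_INET6 and an AF_INET entry, but the exact list and its order depend on the local configuration.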
Using getaddrinfo() results
It is necessary to try all results until one successfully connects. This works perfectly for TCP connections as they can fail gracefully at this stage.
struct addrinfo *item;
int sock;
/* Walk the linked list and stop at the first address that connects. */
for (item = result; item; item = item->ai_next) {
    sock = socket(item->ai_family, item->ai_socktype, item->ai_protocol);
    if (sock == -1)
        continue;
    if (connect(sock, item->ai_addr, item->ai_addrlen) != -1) {
        fprintf(stderr, "Connected successfully.\n");
        break;
    }
    /* This address failed; release the socket and try the next one. */
    close(sock);
}
For UDP, connect() succeeds without contacting the other side (if you are using connect() with UDP at all). Therefore you might want to perform additional actions (such as sending a message and receiving a reply) before crying out "success!".
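The difference can be sketched in Python as follows. Both helper names (udp_connect_only and udp_probe) are hypothetical, and the probe assumes a peer that answers any datagram, which real protocols may not do; treat it as an illustration rather than a general-purpose reachability test. For brevity the sketch is IPv4-only.

```python
import socket

def udp_connect_only(sockaddr):
    """connect() on a UDP socket only records the peer address; nothing is sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect(sockaddr)
        return sock.getpeername()

def udp_probe(sockaddr, payload=b"ping", timeout=1.0):
    """Return True only if the peer answers the datagram within `timeout` seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
            sock.send(payload)
            sock.recv(4096)
            return True
        except OSError:
            # Covers timeouts, ICMP port unreachable (connection refused)
            # and unreachable networks alike.
            return False
```

udp_connect_only() returns normally even when nothing listens on the target port, whereas udp_probe() reports success only after a reply has actually arrived.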
Freeing getaddrinfo() results
When we're done with the results, we'll free the linked list.
freeaddrinfo(result);
Using getaddrinfo() in Python
Python's socket.getaddrinfo() API tries to be a little bit more sane than the C API.
#!/usr/bin/python3
import sys, socket
host = "www.fedoraproject.org"
service = "http"
family = socket.AF_UNSPEC
socktype = socket.SOCK_DGRAM
protocol = 0
flags = 0
result = socket.getaddrinfo(host, service, family, socktype, protocol, flags)
sock = None
for family, socktype, protocol, canonname, sockaddr in result:
    try:
        sock = socket.socket(family, socktype, protocol)
    except socket.error:
        continue
    try:
        sock.connect(sockaddr)
        print("Successfully connected to: {}".format(sockaddr))
    except socket.error:
        sock.close()
        sock = None
        continue
    break
if sock is None:
    print("Failed to connect.", file=sys.stderr)
    sys.exit(1)
Tweaking getaddrinfo() flags
- AI_NUMERICHOST: use literal address, don't perform host resolution
- AI_PASSIVE: return socket addresses suitable for bind() instead of connect(), sendto() and sendmsg()
- AI_NUMERICSERV: use numeric service, don't perform service resolution
- AI_CANONNAME: save canonical name to the first result
- AI_ADDRCONFIG: this never really worked, as far as I know
- AI_V4MAPPED+AI_ALL: only with AF_INET6, return IPv4 addresses mapped into IPv6 space
- AI_V4MAPPED: I don't see any real use for this, only returns mapped IPv4 if there are no IPv6 addresses
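A few of the flags above can be tried out from Python without touching the network. The sketch below is only an illustration: the helper names numeric_only and bind_addresses are hypothetical, and the choice of SOCK_STREAM is an assumption made to get one result per address.

```python
import socket

def numeric_only(host, port):
    """Parse a literal address and a numeric port without any resolver lookups."""
    flags = socket.AI_NUMERICHOST | socket.AI_NUMERICSERV
    return socket.getaddrinfo(host, str(port), socket.AF_UNSPEC,
                              socket.SOCK_STREAM, 0, flags)

def bind_addresses(port, family=socket.AF_INET):
    """With no node given, AI_PASSIVE yields wildcard addresses suitable for bind()."""
    results = socket.getaddrinfo(None, str(port), family,
                                 socket.SOCK_STREAM, 0, socket.AI_PASSIVE)
    return [sockaddr for _family, _type, _proto, _canonname, sockaddr in results]
```

With AI_NUMERICHOST set, passing a hostname instead of a literal raises socket.gaierror rather than triggering a lookup, which makes the flag handy for validating user-supplied addresses.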
Flag AI_ADDRCONFIG considered harmful
As far as I know, AI_ADDRCONFIG was added for the following reasons:
- Some buggy DNS servers would be confused by AAAA requests
- Optimization of the number of DNS queries
Currently, I'm aware of several documents that define AI_ADDRCONFIG:
- POSIX.1-2008: useless but harmless
- RFC 3493 (informational): useless but (partially) breaks IPv4/IPv6 localhost
- RFC 2553 (obsolete informational): useless but hopefully harmless
- GLIBC getaddrinfo(3): like RFC 3493
Actual GLIBC getaddrinfo() behavior differs from the manual page.
Problem statement
Currently, any of the definitions above prevents AI_ADDRCONFIG from filtering out IPv6 addresses when a link-local IPv6 address is present. These addresses are automatically added to interfaces that are otherwise only configured for IPv4. Therefore, on a typical Linux system, AI_ADDRCONFIG cannot meet its goals and is effectively useless.
AI_ADDRCONFIG also builds on a false assumption: that no IPv4 communication is feasible without a non-loopback address. But why would we have a loopback address if we couldn't use it for node-local communication? AI_ADDRCONFIG breaks localhost, localhost4, localhost6, 127.0.0.1, ::1 and more if there's no non-loopback address of the respective protocol.
This can happen when the computer is connected to an IPv4-only or an IPv6-only network, when it loses IPv4 or IPv6 connectivity, and when it's used offline.
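Whether a given system is affected can be checked with a small diagnostic. The following Python sketch resolves the same name with and without AI_ADDRCONFIG and reports both outcomes; the function name compare_addrconfig is hypothetical, and the result depends entirely on the local C library version and the addresses currently configured.

```python
import socket

def compare_addrconfig(host, service="80"):
    """Resolve `host` with and without AI_ADDRCONFIG and report both outcomes."""
    outcome = {}
    for label, flags in (("plain", 0), ("addrconfig", socket.AI_ADDRCONFIG)):
        try:
            results = socket.getaddrinfo(host, service, socket.AF_UNSPEC,
                                         socket.SOCK_STREAM, 0, flags)
            # Keep just the distinct address strings for easy comparison.
            outcome[label] = sorted({result[4][0] for result in results})
        except socket.gaierror as exc:
            outcome[label] = "error: {}".format(exc)
    return outcome
```

Running it for "127.0.0.1", "::1" and "localhost" while the machine is offline shows whether the flag filters out or rejects addresses that would otherwise work locally.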
Making AI_ADDRCONFIG useful
A possible solution for the first problem (that AI_ADDRCONFIG is useless) is to treat link-local addresses the same as loopback (or node-local) addresses. But this is even more harmful.
Fedora's GLIBC was patched to do exactly that. The consequence was that even link-local IPv6 stopped working when a global IPv6 address was absent. And what would we have link-local addresses for if they didn't work without global addresses? This patch has already been reverted.
Conclusion
The whole idea of filtering out non-DNS addresses is flawed and breaks many things, including IPv4 and IPv6 literals. There is no reason to filter them out.
Proposed solutions:
1) Ignore AI_ADDRCONFIG. As it has not been working for years and nobody cared enough to fix it, there is a substantial probability that it's not needed. Remove the code that implements it.
2) Patch software to avoid using AI_ADDRCONFIG. Follow new development, and prevent/reject modifications that add it.
3) Only process AI_ADDRCONFIG in the nsswitch DNS plugin. This requires implementing getaddrinfo() in nsswitch which is required for zeroconf networking anyway. Use solution (1) as a temporary fix.
The first solution is trivial. The second is rather political. The third has so far been the most acceptable. It is the outcome of long discussions between me (Pavel Šimerda) and Tore Anderson, who explained to me the original purpose of AI_ADDRCONFIG.
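At the application level, avoiding the flag can be as simple as masking it out before the call. The wrapper below is a minimal Python sketch of that idea; the function name getaddrinfo_without_addrconfig is hypothetical, and it does not replace fixing the callers or the C library.

```python
import socket

def getaddrinfo_without_addrconfig(host, service, family=0, socktype=0,
                                   proto=0, flags=0):
    """Call socket.getaddrinfo() with AI_ADDRCONFIG cleared from `flags`."""
    return socket.getaddrinfo(host, service, family, socktype, proto,
                              flags & ~socket.AI_ADDRCONFIG)
```

All other flags pass through unchanged, so existing callers keep their behavior apart from the address filtering.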
More resources:
- IPv4: getaddrinfo("127.0.0.1", ...) fails with some AI_ADDRCONFIG configurations
- IPv6: Fedora 808147 - getaddrinfo("::1", ...) fails with some configurations of AI_ADDRCONFIG
- IPv6: getaddrinfo("fe80::1234:56ff:fe78:90%eth0", ...) also fails as above
- IPv6: GLIBC's nsswitch doesn't support overriding getaddrinfo which is required to resolve link-local IPv6 addresses
Comments and discussion
Please send any remarks and questions to psimerda-at-redhat-dot-com or use Talk:Networking/NameResolution. Edit with care.