Java开发网 - New Features in JDK 1.5: Generics

Topic: New Features in JDK 1.5: Generics

1. New Features in JDK 1.5: Generics

Posted by: rogertu
Posted on: 2008-06-05 00:04

Excerpted from My Document: http://docs.google.com/Doc?id=dxtrs83_5dk7cs3gz

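1 First impressions
Before JDK 1.5, any object could be added to an instance of a Collections library container class, because every element was treated as an Object. Whatever came back out was likewise just an Object and had to be cast to the expected type, and it was up to the programmer to guarantee that the cast would succeed at run time.
Consider the following snippet: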

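List intList = new ArrayList(); // a List intended to hold Integer instances
intList.add(new Integer(0));
intList.add("1"); // a String slips in by mistake; neither the compiler nor this call at run time complains, so only a careful code review catches it
Integer i0 = (Integer) intList.get(0);
Integer i1 = (Integer) intList.get(1); // compiles, but throws ClassCastException at run time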

In JDK 1.5, by contrast, you can create a container instance that is explicitly restricted to elements of one type, for example:

List<Integer> intList = new ArrayList<Integer>(); // intList is a List of Integer instances
intList.add(new Integer(0));
Integer i0 = intList.get(0); // no (Integer) cast needed; get() on a List<Integer> returns an Integer
intList.add("1"); // does not compile: add() on a List<Integer> accepts only an Integer argument

"List<Integer> intList = new ArrayList<Integer>();" is the simplest and most common use of generics. Code written this way is clearly more readable and more robust.
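
The two snippets above can be put side by side in one runnable sketch. The class and method names (RawVersusGenericDemo and its two helpers) are made up for illustration, and Integer.valueOf() is used in place of new Integer() only because the constructor is deprecated in recent JDKs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class RawVersusGenericDemo {

    // Pre-1.5 style: the raw List accepts anything, so the mistake shows up only at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List intList = new ArrayList();
        intList.add(Integer.valueOf(0));
        intList.add("1"); // compiles, although a String does not belong here
        try {
            Integer i1 = (Integer) intList.get(1); // the cast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // JDK 1.5 style: the compiler enforces the element type, and no cast is written.
    static int genericListNeedsNoCast() {
        List<Integer> intList = new ArrayList<Integer>();
        intList.add(Integer.valueOf(0));
        // intList.add("1"); // would be rejected at compile time
        Integer i0 = intList.get(0);
        return i0.intValue();
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRuntime());  // true
        System.out.println(genericListNeedsNoCast()); // 0
    }
}
```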

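2 Generic classes
Most classes in the JDK 1.5 Collections library were reworked into generic classes. The following simplified excerpts show how the List and Iterator interfaces are declared in the JDK 1.5 sources: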
public interface List<E> {
    void add(E x);
    Iterator<E> iterator();
}

public interface Iterator<E> {
    E next();
    boolean hasNext();
}

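Take List as an example. The E in "public interface List<E>" is the type parameter of List, and code that uses List can supply a concrete type argument for it (as in List<Integer>). The type argument is used by the Java compiler to guarantee that user code is type safe. For example, the compiler knows that add() on a List<Integer> accepts only an Integer, so passing a String to add() is a compile error; it also knows that next() on an Iterator<Integer> returns an Integer, so your code does not need an (Integer) cast on the result. Once the code passes these checks (it is syntactically valid and cannot cause a type-safety problem at run time), the compiler transforms it. Put simply, it removes the type arguments from the code and inserts casts where they are needed. Take the following code: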

public String demo() {
    List<String> strList = new ArrayList<String>();
    strList.add("Hello!");
    return strList.get(0);
}

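After checking it, the compiler translates it into the equivalent of:
public String demo() {
    List strList = new ArrayList(); // the type argument <String> is removed
    strList.add("Hello!");
    return (String) strList.get(0); // a (String) cast is inserted
}
As this shows, type arguments are used only by the Java compiler, at compile time, to make sure the code is type safe; once verification succeeds they are removed. To the JVM there is only the same List that existed before JDK 1.5, with no distinction between List<Integer> and List<String>. This is the basic idea behind erasure, the key technique in the implementation of Java generics. The following code therefore prints "true" to the console.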

List<String> strList = new ArrayList<String>();
List<Integer> intList = new ArrayList<Integer>();
System.out.println(strList.getClass() == intList.getClass());

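You can think of generics as a new contract between Java source code and the Java compiler, introduced to improve type safety by catching errors at compile time rather than letting them surface at run time. In the compiled *.class file, the part that the JVM executes carries no generic type information (generic signatures survive only as metadata for the compiler and for reflection), so the JVM never sees generics while running the code.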
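
One way to observe the compiler-inserted cast is to put a wrongly typed element into a List<String> through a raw alias, which the compiler permits with an unchecked warning. The sketch below is only an illustration; ErasureCastDemo and its methods are hypothetical names. Reading the element as an Object works because no cast is needed, while reading it as a String fails at the inserted (String) cast.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class ErasureCastDemo {

    // Builds a List<String> that secretly contains an Integer.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<String> pollutedList() {
        List<String> strList = new ArrayList<String>();
        List rawAlias = strList;           // same object, seen through the raw type
        rawAlias.add(Integer.valueOf(42)); // unchecked: nothing verifies the element type at run time
        return strList;
    }

    static boolean readingAsObjectSucceeds() {
        Object first = pollutedList().get(0); // no cast is required for Object
        return first instanceof Integer;
    }

    static boolean readingAsStringFails() {
        try {
            String s = pollutedList().get(0); // the inserted (String) cast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readingAsObjectSucceeds()); // true
        System.out.println(readingAsStringFails());    // true
    }
}
```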

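For a generic class (call it GenericClass) with a type parameter (call it T):

1) Because the JVM sees only a single GenericClass class, T cannot be used in the declarations of GenericClass's static fields or static methods. T may appear only in non-static fields and non-static methods of GenericClass; in other words, T is information tied to an instance of GenericClass.

2) T is understood only by the compiler at compile time, so it cannot be combined with operators such as instanceof and new, whose work is carried out by the JVM at run time.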

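class GenericClass<T> {
    T t1;
    public void method1(T t) {
        t1 = new T(); // compile error: T cannot be used with new
        if (t1 instanceof T) {} // compile error: T cannot be used with instanceof
    }

    static T t2; // compile error: a static field cannot use T
    public static void method2(T t) {} // compile error: a static method cannot use T
}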
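
A common way around both restrictions is to hand the class a Class<T> object that stays available at run time. The sketch below is one possible approach rather than anything from the original post: TypeTokenFactory is a hypothetical name, and it assumes the target type has an accessible no-argument constructor. It also shows that a static method may declare a type parameter of its own, even though it cannot use the class's T.

```java
// Hypothetical helper class; not part of the original post.
class TypeTokenFactory<T> {

    private final Class<T> type; // run-time token standing in for the erased T

    TypeTokenFactory(Class<T> type) {
        this.type = type;
    }

    // Replaces "new T()": creates an instance reflectively.
    T create() {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }

    // Replaces "x instanceof T": asks the Class object instead.
    boolean isInstance(Object candidate) {
        return type.isInstance(candidate);
    }

    // A static method cannot use the class's T, but it can declare its own U.
    static <U> TypeTokenFactory<U> of(Class<U> type) {
        return new TypeTokenFactory<U>(type);
    }

    public static void main(String[] args) {
        TypeTokenFactory<StringBuilder> factory = TypeTokenFactory.of(StringBuilder.class);
        StringBuilder sb = factory.create();
        System.out.println(factory.isInstance(sb));     // true
        System.out.println(factory.isInstance("text")); // false
    }
}
```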

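A generic class can have more than one type parameter, and by convention type parameters are named with a single uppercase letter. For example, Map in the Collections library is declared as:
public interface Map<K, V> {
    // ...
}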

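3 Generic classes and raw types
Every generic class can also be used without specifying type arguments. For example, List<E> can be used as "List<String> list;" or as "List list;". "List<String>" is called a parameterized type (the type parameter has been given a value), while "List" is called a raw type. The raw type List is used in the same way, and behaves the same way, as List did before JDK 1.5; using the raw type therefore gives up the readability and robustness that generics add.

Raw types are clearly permitted for the sake of backward compatibility: code written before JDK 1.5 still compiles and runs correctly under JDK 1.5.

When you use a raw type in JDK 1.5 and add an object to an instance of it, the compiler issues an "unchecked" warning, because it cannot verify that the object being added has the right type. The code compiles in order to preserve backward compatibility; the warning is there to flag the potential risk.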

public void test() {
    List list = new ArrayList();
    list.add("tt"); // the JDK 1.5 compiler issues a warning for this line
}

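4 Generic classes and subtyping
List<String> ls = new ArrayList<String>();
List<Object> lo = ls; // compile error: Type mismatch: cannot convert from List<String> to List<Object>
Is the compile error on the second line, "Type mismatch: cannot convert from List<String> to List<Object>", a little surprising? Intuitively, since String is a subclass of Object and 'Object o = "String"' is therefore legal, a List of String ought to be a subtype of a List of Object, which would make the second line legal. Argue it the other way round: if the second line were legal, then the following code, which fails at run time, would be legal too:
lo.add(new Object()); // puts a non-String object into ls
String s = ls.get(0); // ls.get(0) actually returns a plain Object, so this throws ClassCastException
The compiler obviously must not let that happen, so it refuses to compile "List<Object> lo = ls".

So the intuition that "a List of String is a subtype of a List of Object" is wrong. More generally, if Foo is a subtype of Bar and G is a generic type declaration, G<Foo> is not a subtype of G<Bar>.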
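
If a List<Object> really is needed, one safe option is to copy the elements into a new list instead of aliasing the original. The following is a minimal sketch with hypothetical names (GenericSubtypingDemo, copyAsObjects); because the copy is a separate object, adding an arbitrary Object to it cannot corrupt the List<String>.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class GenericSubtypingDemo {

    // Copies the Strings into an independent List<Object>.
    static List<Object> copyAsObjects(List<String> source) {
        return new ArrayList<Object>(source);
    }

    public static void main(String[] args) {
        List<String> ls = new ArrayList<String>();
        ls.add("hello");

        // List<Object> lo = ls; // rejected by the compiler, as explained above
        List<Object> lo = copyAsObjects(ls);
        lo.add(new Object()); // affects only the copy

        System.out.println(ls.size()); // 1
        System.out.println(lo.size()); // 2
        String s = ls.get(0);          // still safe: ls holds only Strings
        System.out.println(s);         // hello
    }
}
```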

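5 Parameterized types and arrays
Recall that if T is a subtype of S, then T[] is also a subtype of S[]. The following code therefore compiles, and fails only at run time, when the third line throws ArrayStoreException.
String[] words = new String[10];
Object[] objects = words;
objects[0] = new Object(); // compiles, but throws ArrayStoreException at run time

Now consider this code:
List<String>[] wordLists = new ArrayList<String>[10];
ArrayList<Integer> integerList = new ArrayList<Integer>();
integerList.add(123);
Object[] objects = wordLists;
objects[0] = integerList; // no error at run time, because ArrayList<String> and ArrayList<Integer> are both just ArrayList at run time
String s = wordLists[0].get(0); // compiles, but throws ClassCastException at run time
This would be a case of "generics used correctly, yet a ClassCastException still occurs at run time", which the Java compiler clearly must not allow. In fact the first line, "List<String>[] wordLists = new ArrayList<String>[10];", does not compile, so the code after it never comes into play.

More generally, you cannot create an array of a parameterized type with a concrete type argument; only the unbounded wildcard form, such as new ArrayList<?>[10], is permitted.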
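
The array behaviour described above can be checked directly, and a list of lists is one common substitute for the forbidden array of List<String>. This is only a sketch; ArrayCovarianceDemo and its method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class ArrayCovarianceDemo {

    // Arrays know their element type at run time, so a bad store is caught immediately.
    static boolean storeIntoStringArrayFails() {
        Object[] objects = new String[1]; // legal: String[] is a subtype of Object[]
        try {
            objects[0] = Integer.valueOf(1);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Instead of "new ArrayList<String>[n]", keep the lists in another list.
    static List<List<String>> newWordLists(int n) {
        List<List<String>> wordLists = new ArrayList<List<String>>();
        for (int i = 0; i < n; i++) {
            wordLists.add(new ArrayList<String>());
        }
        return wordLists;
    }

    public static void main(String[] args) {
        System.out.println(storeIntoStringArrayFails()); // true

        List<List<String>> wordLists = newWordLists(10);
        wordLists.get(0).add("hello");
        // wordLists.set(0, new ArrayList<Integer>()); // rejected at compile time
        String s = wordLists.get(0).get(0);
        System.out.println(s); // hello
    }
}
```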

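6 The type argument wildcard ?
From section 4 we know that Collection<Object> is not a supertype of collections of other element types (for example Collection<String>). How, then, do we denote a parameterized Collection of any kind? With "Collection<?>", where "?" stands for any type argument. For example, it is easy to write the following general-purpose function printCollection():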
public static void printCollection(Collection<?> c) {
    // this concise way of iterating over a Collection is also new in JDK 1.5
    for (Object o : c) {
        System.out.println(o);
    }
}

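This way, printCollection() can be called with an instance of Collection<String> as well as with an instance of Collection<Integer>.
Note, however, that you cannot add any non-null element to a Collection<?>. For example, the third line of the following code does not compile: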

public static void testAdd(Collection<?> c) {
    c.add(null); // compiles
    c.add("test"); // compile error
}

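This is easy to understand: the concrete element type of c is unknown, so the compiler cannot verify that an object being added has the right type, and therefore no object may be added; null, however, can be regarded as a value of every class type, so it is allowed.

Also, although the element type of c is unknown, we do know that every class is a subclass of Object, so everything taken out of c is uniformly treated as an Object.

Going a step further, if BaseClass is the name of some class that can be extended, then Collection<? extends BaseClass> stands for any parameterized Collection whose type argument is BaseClass or one of its subclasses. For an instance c of Collection<? extends BaseClass>, the concrete element type is still unknown, so no non-null object can be added to it, but everything taken out of c is uniformly treated as a BaseClass. In fact, you can regard Collection<?> as shorthand for Collection<? extends Object>.

The opposite case: if SubClass is the name of any class, then Collection<? super SubClass> stands for any parameterized Collection whose type argument is SubClass or one of its supertypes. For an instance c of Collection<? super SubClass>, you can add SubClass instances to it, but everything taken out of it is only known to be an Object.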
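
A small example may make the two bounded forms more concrete. In the sketch below, Number plays the role of BaseClass and Integer the role of SubClass; WildcardDemo and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class WildcardDemo {

    // "? extends Number": elements can be read as Number, but nothing non-null can be added.
    static double total(Collection<? extends Number> numbers) {
        double sum = 0;
        for (Number n : numbers) {
            sum += n.doubleValue();
        }
        // numbers.add(Integer.valueOf(1)); // compile error: the element type is unknown
        return sum;
    }

    // "? super Integer": Integer instances can be added, whatever the exact element type is.
    static void addFirstThree(Collection<? super Integer> sink) {
        for (int i = 1; i <= 3; i++) {
            sink.add(Integer.valueOf(i));
        }
    }

    // Reading from "? super Integer" yields only Object.
    static Object firstOf(Collection<? super Integer> source) {
        return source.iterator().next();
    }

    public static void main(String[] args) {
        List<Integer> ints = new ArrayList<Integer>();
        List<Number> numbers = new ArrayList<Number>();
        List<Object> objects = new ArrayList<Object>();

        addFirstThree(ints);    // Collection<Integer> is accepted
        addFirstThree(numbers); // so is Collection<Number>
        addFirstThree(objects); // and Collection<Object>

        System.out.println(total(ints));    // 6.0
        System.out.println(total(numbers)); // 6.0
        // total(objects);                  // compile error: Object is not a subtype of Number
        System.out.println(firstOf(objects)); // 1
    }
}
```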

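7 Generic methods
Just as we can define generic classes, we can define generic methods, in which the types of one or more of the method's parameters are parameterized, as in: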
public static <T> void fromArrayToCollection(T[] a, Collection<T> c) {
    for (T o : a) {
        c.add(o); // legal; note the difference from Collection<?>
    }
}

We can call fromArrayToCollection() as follows:

Object[] oa = new Object[100];
Collection<Object> co = new ArrayList<Object>();
fromArrayToCollection(oa, co); // T is Object here
String[] sa = new String[100];
Collection<String> cs = new ArrayList<String>();
fromArrayToCollection(sa, cs); // T is String here
fromArrayToCollection(sa, co); // T is Object here
Integer[] ia = new Integer[100];
Float[] fa = new Float[100];
Number[] na = new Number[100];
Collection<Number> cn = new ArrayList<Number>();
fromArrayToCollection(ia, cn); // T is Number here
fromArrayToCollection(fa, cn); // T is Number here
fromArrayToCollection(na, cn); // T is Number here
fromArrayToCollection(na, co); // T is Object here
fromArrayToCollection(na, cs); // compile error

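As the code above shows, when calling fromArrayToCollection() we do not have to state explicitly which type T is (unlike when using a generic class). We simply pass arguments as with any ordinary method, and the compiler either infers the type for T from those arguments or reports a compile error if the arguments do not fit.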
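
Inference is the usual route, but Java also lets the caller name the type argument explicitly in front of the method name. The sketch below repeats the method inside a hypothetical class, GenericMethodDemo, so that both call styles can be run.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical demo class; not part of the original post.
class GenericMethodDemo {

    static <T> void fromArrayToCollection(T[] a, Collection<T> c) {
        for (T o : a) {
            c.add(o);
        }
    }

    public static void main(String[] args) {
        String[] sa = {"a", "b"};
        Collection<Object> co = new ArrayList<Object>();

        // Inferred: the compiler chooses T = Object so that both arguments fit.
        fromArrayToCollection(sa, co);

        // Explicit: the same call with the type argument written out.
        GenericMethodDemo.<Object>fromArrayToCollection(sa, co);

        System.out.println(co.size()); // 4
    }
}
```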

Consider the following function sum():

public static long sum(Collection<? extends Number> numbers) {
    long sum = 0;
    for (Number n : numbers) {
        sum += n.longValue();
    }
    return sum;
}
We can also implement it as a generic method:
public static <T extends Number> long sum(Collection<T> numbers) {
    long sum = 0;
    for (Number n : numbers) {
        sum += n.longValue();
    }
    return sum;
}

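So when a method needs to accept varying parameter types, should it be a generic method or should it use wildcards? As a rule of thumb, if there is a dependency among the parameter types, or between a parameter type and the return type, use a generic method; otherwise use wildcards.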
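
The rule of thumb can be illustrated with two small methods. Both are made up for this sketch (DependencyDemo, hasNull and lastOf are hypothetical names): the first has no type dependency and uses a wildcard, while the second returns the element type of its argument and therefore needs a type parameter.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class; not part of the original post.
class DependencyDemo {

    // No dependency between parameter type and return type: a wildcard is enough.
    static boolean hasNull(Collection<?> c) {
        for (Object o : c) {
            if (o == null) {
                return true;
            }
        }
        return false;
    }

    // The return type depends on the list's element type: a generic method expresses that.
    static <T> T lastOf(List<T> list) {
        return list.get(list.size() - 1);
    }

    public static void main(String[] args) {
        System.out.println(hasNull(Arrays.asList("a", null))); // true

        String lastWord = lastOf(Arrays.asList("x", "y")); // no cast needed
        Integer lastNumber = lastOf(Arrays.asList(1, 2, 3));
        System.out.println(lastWord);   // y
        System.out.println(lastNumber); // 3
    }
}
```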

This principle is well illustrated in the source code of the Collections library, for example by the containsAll(), addAll() and toArray() methods of the Collection interface:
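interface Collection<E> {
    public boolean containsAll(Collection<?> c); // no dependency among the parameter types or between the parameter and the return type
    public boolean addAll(Collection<? extends E> c); // elements only need to be some subtype of E, so a wildcard suffices
    <T> T[] toArray(T[] a); // the parameter a and the return value are arrays of the same type, so there is a dependency
}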

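Of course, the two can be combined where needed, as in the copy() method of Collections:
class Collections {
    // simplified; the JDK itself declares dest as List<? super T>
    public static <T> void copy(List<T> dest, List<? extends T> src) {
        // ...
    }
}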
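
To show how the type parameter and the wildcards cooperate, here is a hedged sketch of a copy() method. It is not the JDK source: CopySketch is a hypothetical class, the body is a plain index loop, and the signature uses List<? super T> for the destination.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the JDK's Collections.copy() implementation.
class CopySketch {

    // T links the two parameters: src produces Ts, dest accepts Ts.
    static <T> void copy(List<? super T> dest, List<? extends T> src) {
        if (src.size() > dest.size()) {
            throw new IndexOutOfBoundsException("Source does not fit in dest");
        }
        for (int i = 0; i < src.size(); i++) {
            dest.set(i, src.get(i)); // overwrite the element at the same index
        }
    }

    public static void main(String[] args) {
        List<Number> dest = new ArrayList<Number>();
        dest.add(Double.valueOf(0.5));
        dest.add(Double.valueOf(1.5));
        dest.add(Double.valueOf(2.5));

        List<Integer> src = Arrays.asList(1, 2);

        copy(dest, src); // a List<Integer> copied into a List<Number>
        System.out.println(dest); // [1, 2, 2.5]
    }
}
```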


2. Re: New Features in JDK 1.5: Generics [Re: rogertu]


Posted by: JiafanZhou
Posted on: 2008-06-05 16:25

rogertu wrote:
This is the basic idea behind erasure, the key technique in the implementation of Java generics.

What does "Erasure" mean here?

rogertu wrote:
Raw types are clearly permitted for the sake of backward compatibility: code written before JDK 1.5 still compiles and runs correctly under JDK 1.5.

This is *not* true. It is only one of the reasons for allowing raw types. The main purpose is to let both raw types and generic types be used when writing Java code. Otherwise it would be painful to write everything in generics.

rogertu wrote:
Recall that if T is a subtype of S, then T[] is also a subtype of S[].
String[] words = new String[10];
Object[] objects = words;
objects[0] = new Object();

I don't think this is correct. We only use inheritance to describe the object model, never array types. So we cannot say T[] is a subtype of S[]. They are arrays; how could an array have any inheritance? The code above only fools the compiler and will end up with a runtime exception.

rogertu wrote:
From section 4 we know that Collection<Object> is not a supertype of collections of other element types (for example Collection<String>). How, then, do we denote a parameterized Collection of any kind?

I have read this sentence about four times and I have no clue what it means. It is very badly translated into Chinese. It should be translated as:
A Collection<Object> can hold only objects of type Object, not objects of any subclass of Object.

rogertu wrote:
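The opposite case: if SubClass is the name of any class, then Collection<? super SubClass> stands for any parameterized Collection whose type argument is SubClass or one of its supertypes. For an instance c of Collection<? super SubClass>, you can add SubClass instances to it, but everything taken out of it is only known to be an Object.

Again, not very intuitive; an example here would be very helpful.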

rogertu wrote:
public static long sum(Collection<? extends Number> numbers) {
    long sum = 0;
    for (Number n : numbers) {
        sum += n.longValue();
    }
    return sum;
}
We can also implement it as a generic method:
public static <T extends Number> long sum(Collection<T> numbers) {
    long sum = 0;
    for (Number n : numbers) {
        sum += n.longValue();
    }
    return sum;
}

This is a good part. I like it.