DTREG uses the conjugate gradient algorithm to adjust weight values using the gradient during the backward propagation of errors through the network. Compared to gradient descent, the conjugate gradient algorithm takes a more direct path to the optimal set of weight values. Usually, conjugate gradient is significantly faster and more robust than gradient descent. Conjugate gradient also does not require the user to specify learning rate and momentum parameters.
The traditional conjugate gradient algorithm uses the gradient to compute a search direction. It then uses a line search algorithm such as Brent’s Method to find the optimal step size along a line in the search direction. The line search avoids the need to compute the Hessian matrix of second derivatives, but it requires computing the error at multiple points along the line. The conjugate gradient algorithm with line search (CGL) has been used successfully in many neural network programs, and is widely regarded as one of the most effective training methods available.
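The search-direction and line-search steps described above can be illustrated with a minimal sketch. This is not DTREG's implementation: the function names, the Polak-Ribière update with restart, and the test function are all assumptions chosen for illustration, and a golden-section search stands in for Brent's Method to keep the example dependency-free.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def golden_section(phi, lo, hi, tol=1e-8):
    """Minimize a unimodal 1-D function phi on [lo, hi] (stand-in for Brent's Method)."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while abs(b - a) > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if phi(c) < phi(d):
            b = d
        else:
            a = c
    return (a + b) / 2

def conjugate_gradient(f, grad, w, iters=50, gtol=1e-8):
    """Conjugate gradient with line search (hypothetical helper, Polak-Ribiere variant)."""
    g = grad(w)
    d = [-gi for gi in g]                      # first direction: steepest descent
    for _ in range(iters):
        if math.sqrt(dot(g, g)) < gtol:
            break
        if dot(g, d) >= 0:                     # not a descent direction: restart
            d = [-gi for gi in g]
        phi = lambda a: f([wi + a * di for wi, di in zip(w, d)])
        hi = 1.0
        while phi(2 * hi) < phi(hi) and hi < 1e6:   # expand the bracket
            hi *= 2
        step = golden_section(phi, 0.0, 2 * hi)     # many error evaluations per line
        w = [wi + step * di for wi, di in zip(w, d)]
        g_new = grad(w)
        diff = [a - b for a, b in zip(g_new, g)]
        beta = max(0.0, dot(g_new, diff) / dot(g, g))
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return w

# Illustrative error surface with its minimum at (1, -2).
def f_example(w):
    return (w[0] - 1) ** 2 + 10 * (w[1] + 2) ** 2

def grad_example(w):
    return [2 * (w[0] - 1), 20 * (w[1] + 2)]

w_min = conjugate_gradient(f_example, grad_example, [0.0, 0.0])
```

Note that no learning rate or momentum appears anywhere: the line search chooses the step size on every iteration, at the cost of evaluating the error function many times along each line.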
DTREG provides the traditional conjugate gradient algorithm with line search, but it also offers a newer algorithm, Scaled Conjugate Gradient (see Moller, 1993).
The scaled conjugate gradient algorithm uses a numerical approximation for the second derivatives (Hessian matrix), but it avoids instability by combining the model-trust region approach from the Levenberg-Marquardt algorithm with the conjugate gradient approach. This allows scaled conjugate gradient to compute the optimal step size in the search direction without having to perform the computationally expensive line search used by the traditional conjugate gradient algorithm. Of course, there is a cost involved in estimating the second derivatives.
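As a rough illustration of how a step size can be obtained without a line search, the sketch below estimates the curvature along the search direction from one extra gradient evaluation and adds a Levenberg-Marquardt style scaling term. It covers only this one step, not the full algorithm from Moller (1993); the names `scg_step_size`, `lam`, and `sigma0` are assumptions for illustration, and the rule for adjusting `lam` between iterations is omitted.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def scg_step_size(grad, w, p, lam=0.0, sigma0=1e-4):
    """Step size along direction p from a finite-difference curvature estimate.

    lam is the scaling (trust-region) parameter; larger values give shorter steps.
    """
    g = grad(w)
    p_norm2 = dot(p, p)
    sigma = sigma0 / math.sqrt(p_norm2)
    w_probe = [wi + sigma * pi for wi, pi in zip(w, p)]
    # s approximates the Hessian-vector product H p without forming H.
    s = [(a - b) / sigma for a, b in zip(grad(w_probe), g)]
    delta = dot(p, s) + lam * p_norm2          # scaled curvature along p
    if delta <= 0:
        # The full algorithm would raise lam and retry; this sketch only reports it.
        raise ValueError("non-positive curvature; increase lam")
    mu = -dot(p, g)
    return mu / delta

# Illustrative quadratic error surface (same shape as a 2-weight toy problem).
def grad_example(w):
    return [2 * (w[0] - 1), 20 * (w[1] + 2)]

w0 = [0.0, 0.0]
p0 = [-gi for gi in grad_example(w0)]          # steepest-descent direction
alpha_plain = scg_step_size(grad_example, w0, p0)
alpha_scaled = scg_step_size(grad_example, w0, p0, lam=5.0)
```

The extra gradient evaluation at `w_probe` is the cost of estimating the second derivatives mentioned above. On a quadratic surface with `lam=0`, the returned value matches the exact minimizer along the line.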
Tests performed by Moller show the scaled conjugate gradient algorithm converging up to twice as fast as traditional conjugate gradient and up to 20 times as fast as backpropagation using gradient descent. Moller’s tests also showed that scaled conjugate gradient failed to converge less often than traditional conjugate gradient or backpropagation using gradient descent.
Avoiding Overfitting
“Overfitting” occurs when the parameters of a model are tuned so tightly that the model fits the training data well but has poor accuracy on separate data not used for training. Multilayer perceptrons are subject to overfitting, as are most other types of models.
DTREG has two methods for dealing with overfitting: (1) selecting the optimal number of neurons as described above, and (2) evaluating the model as the parameters are being tuned and stopping the tuning when overfitting is detected. The second method is known as “early stopping”.
If you enable the early-stopping option, DTREG holds out a specified percentage of the training rows and uses them to check for overfitting as model tuning is performed. The tuning process uses the remaining training data to search for optimal parameter values. While this process is running, the model is evaluated on the hold-out test rows, and the error from that test is compared with the error computed using previous parameter values. If the error on the test rows does not decrease after a specified number of iterations, DTREG stops the training and uses the parameters that produced the lowest error on the test data.
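The early-stopping rule described above can be sketched as a generic loop. This is an illustration under assumptions, not DTREG's code: `train_one_iteration`, `validation_error`, and `patience` are hypothetical names for the tuning step, the error on the hold-out rows, and the specified number of iterations without improvement.

```python
import copy

def train_with_early_stopping(train_one_iteration, validation_error,
                              params, patience=10, max_iters=1000):
    """Return the parameters with the lowest hold-out error seen during training."""
    best_err = validation_error(params)
    best_params = copy.deepcopy(params)
    since_best = 0
    for _ in range(max_iters):
        params = train_one_iteration(params)   # tuning uses the training rows only
        err = validation_error(params)         # evaluated on the hold-out rows
        if err < best_err:
            best_err = err
            best_params = copy.deepcopy(params)
            since_best = 0
        else:
            since_best += 1
            if since_best >= patience:         # no improvement for `patience` iterations
                break
    return best_params, best_err

# Toy stand-ins: "params" is just an iteration counter, and the hold-out
# error falls until iteration 5 and then rises again (overfitting).
best_params, best_err = train_with_early_stopping(
    train_one_iteration=lambda p: p + 1,
    validation_error=lambda p: (p - 5) ** 2,
    params=0,
    patience=3,
)
```

In this toy run the loop stops three iterations after the hold-out error starts rising and returns the parameters from iteration 5, not the last ones computed.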
See page 67 for information about setting the parameters for the conjugate gradient algorithm.
